CSV to JSON

github.com/tmaiaroto/ratchet

Package ratchet is a library for performing data pipeline / ETL tasks in Go. The main construct in Ratchet is Pipeline. A Pipeline has a series of PipelineStages, which will each perform some type of data processing, and then send new data on to the next stage. Each PipelineStage consists of one or more DataProcessors, which are responsible for receiving, processing, and then sending data on to the next stage of processing. DataProcessors each run in their own goroutine, and therefore all data processing can be executing concurrently. Here is a conceptual drawing of a fairly simple Pipeline: In this example, we have a Pipeline consisting of 3 PipelineStages. The first stage has a DataProcessor that runs queries on a SQL database, the second is doing custom transformation work on that data, and the third stage branches into 2 DataProcessors, one writing the resulting data to a CSV file, and the other inserting into another SQL database. In the example above, Stage 1 and Stage 3 are using built-in DataProcessors (see the "processors" package/subdirectory). However, Stage 2 is using a custom implementation of DataProcessor. By using a combination of built-in processors, and supporting the writing of any Go code to process data, Ratchet makes it possible to write very custom and fast data pipeline systems. See the DataProcessor documentation to learn more. Since each DataProcessor is running in it's own goroutine, SQLReader can continue pulling and sending data while each subsequent stage is also processing data. Optimally-designed pipelines have processors that can each run in an isolated fashion, processing data without having to worry about what's coming next down the pipeline. All data payloads sent between DataProcessors are of type data.JSON ([]byte). This provides a good balance of consistency and flexibility. See the "data" package for details and helper functions for dealing with data.JSON. Another good read for handling JSON data in Go is http://blog.golang.org/json-and-go. Note that many of the concepts in Ratchet were taken from the Golang blog's post on pipelines (http://blog.golang.org/pipelines). While the details discussed in that blog post are largely abstracted away by Ratchet, it is still an interesting read and will help explain the general concepts being applied. There are two ways to construct and run a Pipeline. The first is a basic, non-branching Pipeline. For example: This is a 3-stage Pipeline that queries some SQL data in stage 1, does some custom data transformation in stage 2, and then writes the resulting data to a SQL table in stage 3. The code to create and run this basic Pipeline would look something like: The second way to construct a Pipeline is using a PipelineLayout. This method allows for more complex Pipeline configurations that support branching between stages that are running multiple DataProcessors. Here is a (fairly complex) example: This Pipeline consists of 4 stages where each DataProcessor is choosing which DataProcessors in the subsequent stage should receive the data it sends. The SQLReader in stage 2, for example, is sending data to only 2 processors in the next stage, while the Custom DataProcessor in stage 2 is sending it's data to 3. The code for constructing and running a Pipeline like this would look like: This example is only conceptual, the main points being to explain the flexibility you have when designing your Pipeline's layout and to demonstrate the syntax for constructing a new PipelineLayout.

v2.0.4+incompatible • 8 years ago

github.com/cromon/ratchet2

Package ratchet is a library for performing data pipeline / ETL tasks in Go. The main construct in Ratchet is Pipeline. A Pipeline has a series of PipelineStages, which will each perform some type of data processing, and then send new data on to the next stage. Each PipelineStage consists of one or more DataProcessors, which are responsible for receiving, processing, and then sending data on to the next stage of processing. DataProcessors each run in their own goroutine, and therefore all data processing can be executing concurrently. Here is a conceptual drawing of a fairly simple Pipeline: In this example, we have a Pipeline consisting of 3 PipelineStages. The first stage has a DataProcessor that runs queries on a SQL database, the second is doing custom transformation work on that data, and the third stage branches into 2 DataProcessors, one writing the resulting data to a CSV file, and the other inserting into another SQL database. In the example above, Stage 1 and Stage 3 are using built-in DataProcessors (see the "processors" package/subdirectory). However, Stage 2 is using a custom implementation of DataProcessor. By using a combination of built-in processors, and supporting the writing of any Go code to process data, Ratchet makes it possible to write very custom and fast data pipeline systems. See the DataProcessor documentation to learn more. Since each DataProcessor is running in it's own goroutine, SQLReader can continue pulling and sending data while each subsequent stage is also processing data. Optimally-designed pipelines have processors that can each run in an isolated fashion, processing data without having to worry about what's coming next down the pipeline. All data payloads sent between DataProcessors are of type data.JSON ([]byte). This provides a good balance of consistency and flexibility. See the "data" package for details and helper functions for dealing with data.JSON. Another good read for handling JSON data in Go is http://blog.golang.org/json-and-go. Note that many of the concepts in Ratchet were taken from the Golang blog's post on pipelines (http://blog.golang.org/pipelines). While the details discussed in that blog post are largely abstracted away by Ratchet, it is still an interesting read and will help explain the general concepts being applied. There are two ways to construct and run a Pipeline. The first is a basic, non-branching Pipeline. For example: This is a 3-stage Pipeline that queries some SQL data in stage 1, does some custom data transformation in stage 2, and then writes the resulting data to a SQL table in stage 3. The code to create and run this basic Pipeline would look something like: The second way to construct a Pipeline is using a PipelineLayout. This method allows for more complex Pipeline configurations that support branching between stages that are running multiple DataProcessors. Here is a (fairly complex) example: This Pipeline consists of 4 stages where each DataProcessor is choosing which DataProcessors in the subsequent stage should receive the data it sends. The SQLReader in stage 2, for example, is sending data to only 2 processors in the next stage, while the Custom DataProcessor in stage 2 is sending it's data to 3. The code for constructing and running a Pipeline like this would look like: This example is only conceptual, the main points being to explain the flexibility you have when designing your Pipeline's layout and to demonstrate the syntax for constructing a new PipelineLayout.

v0.0.0-20210312211124-23619b44cb6a • 4 years ago

github.com/tsaserv/ratchet

Package ratchet is a library for performing data pipeline / ETL tasks in Go. The main construct in Ratchet is Pipeline. A Pipeline has a series of PipelineStages, which will each perform some type of data processing, and then send new data on to the next stage. Each PipelineStage consists of one or more DataProcessors, which are responsible for receiving, processing, and then sending data on to the next stage of processing. DataProcessors each run in their own goroutine, and therefore all data processing can be executing concurrently. Here is a conceptual drawing of a fairly simple Pipeline: In this example, we have a Pipeline consisting of 3 PipelineStages. The first stage has a DataProcessor that runs queries on a SQL database, the second is doing custom transformation work on that data, and the third stage branches into 2 DataProcessors, one writing the resulting data to a CSV file, and the other inserting into another SQL database. In the example above, Stage 1 and Stage 3 are using built-in DataProcessors (see the "processors" package/subdirectory). However, Stage 2 is using a custom implementation of DataProcessor. By using a combination of built-in processors, and supporting the writing of any Go code to process data, Ratchet makes it possible to write very custom and fast data pipeline systems. See the DataProcessor documentation to learn more. Since each DataProcessor is running in it's own goroutine, SQLReader can continue pulling and sending data while each subsequent stage is also processing data. Optimally-designed pipelines have processors that can each run in an isolated fashion, processing data without having to worry about what's coming next down the pipeline. All data payloads sent between DataProcessors are of type data.JSON ([]byte). This provides a good balance of consistency and flexibility. See the "data" package for details and helper functions for dealing with data.JSON. Another good read for handling JSON data in Go is http://blog.golang.org/json-and-go. Note that many of the concepts in Ratchet were taken from the Golang blog's post on pipelines (http://blog.golang.org/pipelines). While the details discussed in that blog post are largely abstracted away by Ratchet, it is still an interesting read and will help explain the general concepts being applied. There are two ways to construct and run a Pipeline. The first is a basic, non-branching Pipeline. For example: This is a 3-stage Pipeline that queries some SQL data in stage 1, does some custom data transformation in stage 2, and then writes the resulting data to a SQL table in stage 3. The code to create and run this basic Pipeline would look something like: The second way to construct a Pipeline is using a PipelineLayout. This method allows for more complex Pipeline configurations that support branching between stages that are running multiple DataProcessors. Here is a (fairly complex) example: This Pipeline consists of 4 stages where each DataProcessor is choosing which DataProcessors in the subsequent stage should receive the data it sends. The SQLReader in stage 2, for example, is sending data to only 2 processors in the next stage, while the Custom DataProcessor in stage 2 is sending it's data to 3. The code for constructing and running a Pipeline like this would look like: This example is only conceptual, the main points being to explain the flexibility you have when designing your Pipeline's layout and to demonstrate the syntax for constructing a new PipelineLayout.

v3.0.2+incompatible • 3 years ago

github.com/kjk7212/ratchet

Ratchet is a library for performing data pipeline / ETL tasks in Go. The main construct in Ratchet is Pipeline. A Pipeline has a series of PipelineStages, which will each perform some type of data processing, and then send new data on to the next stage. Each PipelineStage consists of one or more DataProcessors, which are responsible for receiving, processing, and then sending data on to the next stage of processing. DataProcessors each run in their own goroutine, and therefore all data processing can be executing concurrently. Here is a conceptual drawing of a fairly simple Pipeline: In this example, we have a Pipeline consisting of 3 PipelineStages. The first stage has a DataProcessor that runs queries on a SQL database, the second is doing custom transformation work on that data, and the third stage branches into 2 DataProcessors, one writing the resulting data to a CSV file, and the other inserting into another SQL database. In the example above, Stage 1 and Stage 3 are using built-in DataProcessors (see the "processors" package/subdirectory). However, Stage 2 is using a custom implementation of DataProcessor. By using a combination of built-in processors, and supporting the writing of any Go code to process data, Ratchet makes it possible to write very custom and fast data pipeline systems. See the DataProcessor documentation to learn more. Since each DataProcessor is running in it's own goroutine, SQLReader can continue pulling and sending data while each subsequent stage is also processing data. Optimally-designed pipelines have processors that can each run in an isolated fashion, processing data without having to worry about what's coming next down the pipeline. All data payloads sent between DataProcessors are of type data.JSON ([]byte). This provides a good balance of consistency and flexibility. See the "data" package for details and helper functions for dealing with data.JSON. Another good read for handling JSON data in Go is http://blog.golang.org/json-and-go. Note that many of the concepts in Ratchet were taken from the Golang blog's post on pipelines (http://blog.golang.org/pipelines). While the details discussed in that blog post are largely abstracted away by Ratchet, it is still an interesting read and will help explain the general concepts being applied. There are two ways to construct and run a Pipeline. The first is a basic, non-branching Pipeline. For example: This is a 3-stage Pipeline that queries some SQL data in stage 1, does some custom data transformation in stage 2, and then writes the resulting data to a SQL table in stage 3. The code to create and run this basic Pipeline would look something like: The second way to construct a Pipeline is using a PipelineLayout. This method allows for more complex Pipeline configurations that support branching between stages that are running multiple DataProcessors. Here is a (fairly complex) example: This Pipeline consists of 4 stages where each DataProcessor is choosing which DataProcessors in the subsequent stage should receive the data it sends. The SQLReader in stage 2, for example, is sending data to only 2 processors in the next stage, while the Custom DataProcessor in stage 2 is sending it's data to 3. The code for constructing and running a Pipeline like this would look like: This example is only conceptual, the main points being to explain the flexibility you have when designing your Pipeline's layout and to demonstrate the syntax for constructing a new PipelineLayout.

v1.5.2 • 2 years ago

github.com/mcswitch/ratchet

Package ratchet is a library for performing data pipeline / ETL tasks in Go. The main construct in Ratchet is Pipeline. A Pipeline has a series of PipelineStages, which will each perform some type of data processing, and then send new data on to the next stage. Each PipelineStage consists of one or more DataProcessors, which are responsible for receiving, processing, and then sending data on to the next stage of processing. DataProcessors each run in their own goroutine, and therefore all data processing can be executing concurrently. Here is a conceptual drawing of a fairly simple Pipeline: In this example, we have a Pipeline consisting of 3 PipelineStages. The first stage has a DataProcessor that runs queries on a SQL database, the second is doing custom transformation work on that data, and the third stage branches into 2 DataProcessors, one writing the resulting data to a CSV file, and the other inserting into another SQL database. In the example above, Stage 1 and Stage 3 are using built-in DataProcessors (see the "processors" package/subdirectory). However, Stage 2 is using a custom implementation of DataProcessor. By using a combination of built-in processors, and supporting the writing of any Go code to process data, Ratchet makes it possible to write very custom and fast data pipeline systems. See the DataProcessor documentation to learn more. Since each DataProcessor is running in it's own goroutine, SQLReader can continue pulling and sending data while each subsequent stage is also processing data. Optimally-designed pipelines have processors that can each run in an isolated fashion, processing data without having to worry about what's coming next down the pipeline. All data payloads sent between DataProcessors are of type data.JSON ([]byte). This provides a good balance of consistency and flexibility. See the "data" package for details and helper functions for dealing with data.JSON. Another good read for handling JSON data in Go is http://blog.golang.org/json-and-go. Note that many of the concepts in Ratchet were taken from the Golang blog's post on pipelines (http://blog.golang.org/pipelines). While the details discussed in that blog post are largely abstracted away by Ratchet, it is still an interesting read and will help explain the general concepts being applied. There are two ways to construct and run a Pipeline. The first is a basic, non-branching Pipeline. For example: This is a 3-stage Pipeline that queries some SQL data in stage 1, does some custom data transformation in stage 2, and then writes the resulting data to a SQL table in stage 3. The code to create and run this basic Pipeline would look something like: The second way to construct a Pipeline is using a PipelineLayout. This method allows for more complex Pipeline configurations that support branching between stages that are running multiple DataProcessors. Here is a (fairly complex) example: This Pipeline consists of 4 stages where each DataProcessor is choosing which DataProcessors in the subsequent stage should receive the data it sends. The SQLReader in stage 2, for example, is sending data to only 2 processors in the next stage, while the Custom DataProcessor in stage 2 is sending it's data to 3. The code for constructing and running a Pipeline like this would look like: This example is only conceptual, the main points being to explain the flexibility you have when designing your Pipeline's layout and to demonstrate the syntax for constructing a new PipelineLayout.

v2.0.8+incompatible • 6 years ago

github.com/tesla-sca/ratchet

Package ratchet is a library for performing data pipeline / ETL tasks in Go. The main construct in Ratchet is Pipeline. A Pipeline has a series of PipelineStages, which will each perform some type of data processing, and then send new data on to the next stage. Each PipelineStage consists of one or more DataProcessors, which are responsible for receiving, processing, and then sending data on to the next stage of processing. DataProcessors each run in their own goroutine, and therefore all data processing can be executing concurrently. Here is a conceptual drawing of a fairly simple Pipeline: In this example, we have a Pipeline consisting of 3 PipelineStages. The first stage has a DataProcessor that runs queries on a SQL database, the second is doing custom transformation work on that data, and the third stage branches into 2 DataProcessors, one writing the resulting data to a CSV file, and the other inserting into another SQL database. In the example above, Stage 1 and Stage 3 are using built-in DataProcessors (see the "processors" package/subdirectory). However, Stage 2 is using a custom implementation of DataProcessor. By using a combination of built-in processors, and supporting the writing of any Go code to process data, Ratchet makes it possible to write very custom and fast data pipeline systems. See the DataProcessor documentation to learn more. Since each DataProcessor is running in it's own goroutine, SQLReader can continue pulling and sending data while each subsequent stage is also processing data. Optimally-designed pipelines have processors that can each run in an isolated fashion, processing data without having to worry about what's coming next down the pipeline. All data payloads sent between DataProcessors are of type data.JSON ([]byte). This provides a good balance of consistency and flexibility. See the "data" package for details and helper functions for dealing with data.JSON. Another good read for handling JSON data in Go is http://blog.golang.org/json-and-go. Note that many of the concepts in Ratchet were taken from the Golang blog's post on pipelines (http://blog.golang.org/pipelines). While the details discussed in that blog post are largely abstracted away by Ratchet, it is still an interesting read and will help explain the general concepts being applied. There are two ways to construct and run a Pipeline. The first is a basic, non-branching Pipeline. For example: This is a 3-stage Pipeline that queries some SQL data in stage 1, does some custom data transformation in stage 2, and then writes the resulting data to a SQL table in stage 3. The code to create and run this basic Pipeline would look something like: The second way to construct a Pipeline is using a PipelineLayout. This method allows for more complex Pipeline configurations that support branching between stages that are running multiple DataProcessors. Here is a (fairly complex) example: This Pipeline consists of 4 stages where each DataProcessor is choosing which DataProcessors in the subsequent stage should receive the data it sends. The SQLReader in stage 2, for example, is sending data to only 2 processors in the next stage, while the Custom DataProcessor in stage 2 is sending it's data to 3. The code for constructing and running a Pipeline like this would look like: This example is only conceptual, the main points being to explain the flexibility you have when designing your Pipeline's layout and to demonstrate the syntax for constructing a new PipelineLayout.

v1.0.0 • 3 years ago

github.com/likearthian/ratchet

Package ratchet is a library for performing data pipeline / ETL tasks in Go. The main construct in Ratchet is Pipeline. A Pipeline has a series of PipelineStages, which will each perform some type of data processing, and then send new data on to the next stage. Each PipelineStage consists of one or more DataProcessors, which are responsible for receiving, processing, and then sending data on to the next stage of processing. DataProcessors each run in their own goroutine, and therefore all data processing can be executing concurrently. Here is a conceptual drawing of a fairly simple Pipeline: In this example, we have a Pipeline consisting of 3 PipelineStages. The first stage has a DataProcessor that runs queries on a SQL database, the second is doing custom transformation work on that data, and the third stage branches into 2 DataProcessors, one writing the resulting data to a CSV file, and the other inserting into another SQL database. In the example above, Stage 1 and Stage 3 are using built-in DataProcessors (see the "processors" package/subdirectory). However, Stage 2 is using a custom implementation of DataProcessor. By using a combination of built-in processors, and supporting the writing of any Go code to process data, Ratchet makes it possible to write very custom and fast data pipeline systems. See the DataProcessor documentation to learn more. Since each DataProcessor is running in it's own goroutine, SQLReader can continue pulling and sending data while each subsequent stage is also processing data. Optimally-designed pipelines have processors that can each run in an isolated fashion, processing data without having to worry about what's coming next down the pipeline. All data payloads sent between DataProcessors are of type data.JSON ([]byte). This provides a good balance of consistency and flexibility. See the "data" package for details and helper functions for dealing with data.JSON. Another good read for handling JSON data in Go is http://blog.golang.org/json-and-go. Note that many of the concepts in Ratchet were taken from the Golang blog's post on pipelines (http://blog.golang.org/pipelines). While the details discussed in that blog post are largely abstracted away by Ratchet, it is still an interesting read and will help explain the general concepts being applied. There are two ways to construct and run a Pipeline. The first is a basic, non-branching Pipeline. For example: This is a 3-stage Pipeline that queries some SQL data in stage 1, does some custom data transformation in stage 2, and then writes the resulting data to a SQL table in stage 3. The code to create and run this basic Pipeline would look something like: The second way to construct a Pipeline is using a PipelineLayout. This method allows for more complex Pipeline configurations that support branching between stages that are running multiple DataProcessors. Here is a (fairly complex) example: This Pipeline consists of 4 stages where each DataProcessor is choosing which DataProcessors in the subsequent stage should receive the data it sends. The SQLReader in stage 2, for example, is sending data to only 2 processors in the next stage, while the Custom DataProcessor in stage 2 is sending it's data to 3. The code for constructing and running a Pipeline like this would look like: This example is only conceptual, the main points being to explain the flexibility you have when designing your Pipeline's layout and to demonstrate the syntax for constructing a new PipelineLayout.

v2.0.4+incompatible • 8 years ago

github.com/subzerodevelopment/ratchet

Package ratchet is a library for performing data pipeline / ETL tasks in Go. The main construct in Ratchet is Pipeline. A Pipeline has a series of PipelineStages, which will each perform some type of data processing, and then send new data on to the next stage. Each PipelineStage consists of one or more DataProcessors, which are responsible for receiving, processing, and then sending data on to the next stage of processing. DataProcessors each run in their own goroutine, and therefore all data processing can be executing concurrently. Here is a conceptual drawing of a fairly simple Pipeline: In this example, we have a Pipeline consisting of 3 PipelineStages. The first stage has a DataProcessor that runs queries on a SQL database, the second is doing custom transformation work on that data, and the third stage branches into 2 DataProcessors, one writing the resulting data to a CSV file, and the other inserting into another SQL database. In the example above, Stage 1 and Stage 3 are using built-in DataProcessors (see the "processors" package/subdirectory). However, Stage 2 is using a custom implementation of DataProcessor. By using a combination of built-in processors, and supporting the writing of any Go code to process data, Ratchet makes it possible to write very custom and fast data pipeline systems. See the DataProcessor documentation to learn more. Since each DataProcessor is running in it's own goroutine, SQLReader can continue pulling and sending data while each subsequent stage is also processing data. Optimally-designed pipelines have processors that can each run in an isolated fashion, processing data without having to worry about what's coming next down the pipeline. All data payloads sent between DataProcessors are of type data.JSON ([]byte). This provides a good balance of consistency and flexibility. See the "data" package for details and helper functions for dealing with data.JSON. Another good read for handling JSON data in Go is http://blog.golang.org/json-and-go. Note that many of the concepts in Ratchet were taken from the Golang blog's post on pipelines (http://blog.golang.org/pipelines). While the details discussed in that blog post are largely abstracted away by Ratchet, it is still an interesting read and will help explain the general concepts being applied. There are two ways to construct and run a Pipeline. The first is a basic, non-branching Pipeline. For example: This is a 3-stage Pipeline that queries some SQL data in stage 1, does some custom data transformation in stage 2, and then writes the resulting data to a SQL table in stage 3. The code to create and run this basic Pipeline would look something like: The second way to construct a Pipeline is using a PipelineLayout. This method allows for more complex Pipeline configurations that support branching between stages that are running multiple DataProcessors. Here is a (fairly complex) example: This Pipeline consists of 4 stages where each DataProcessor is choosing which DataProcessors in the subsequent stage should receive the data it sends. The SQLReader in stage 2, for example, is sending data to only 2 processors in the next stage, while the Custom DataProcessor in stage 2 is sending it's data to 3. The code for constructing and running a Pipeline like this would look like: This example is only conceptual, the main points being to explain the flexibility you have when designing your Pipeline's layout and to demonstrate the syntax for constructing a new PipelineLayout.

v2.0.4+incompatible • 8 years ago

github.com/synecdoche-se/ratchet3

Package ratchet is a library for performing data pipeline / ETL tasks in Go. The main construct in Ratchet is Pipeline. A Pipeline has a series of PipelineStages, which will each perform some type of data processing, and then send new data on to the next stage. Each PipelineStage consists of one or more DataProcessors, which are responsible for receiving, processing, and then sending data on to the next stage of processing. DataProcessors each run in their own goroutine, and therefore all data processing can be executing concurrently. Here is a conceptual drawing of a fairly simple Pipeline: In this example, we have a Pipeline consisting of 3 PipelineStages. The first stage has a DataProcessor that runs queries on a SQL database, the second is doing custom transformation work on that data, and the third stage branches into 2 DataProcessors, one writing the resulting data to a CSV file, and the other inserting into another SQL database. In the example above, Stage 1 and Stage 3 are using built-in DataProcessors (see the "processors" package/subdirectory). However, Stage 2 is using a custom implementation of DataProcessor. By using a combination of built-in processors, and supporting the writing of any Go code to process data, Ratchet makes it possible to write very custom and fast data pipeline systems. See the DataProcessor documentation to learn more. Since each DataProcessor is running in it's own goroutine, SQLReader can continue pulling and sending data while each subsequent stage is also processing data. Optimally-designed pipelines have processors that can each run in an isolated fashion, processing data without having to worry about what's coming next down the pipeline. All data payloads sent between DataProcessors are of type data.JSON ([]byte). This provides a good balance of consistency and flexibility. See the "data" package for details and helper functions for dealing with data.JSON. Another good read for handling JSON data in Go is http://blog.golang.org/json-and-go. Note that many of the concepts in Ratchet were taken from the Golang blog's post on pipelines (http://blog.golang.org/pipelines). While the details discussed in that blog post are largely abstracted away by Ratchet, it is still an interesting read and will help explain the general concepts being applied. There are two ways to construct and run a Pipeline. The first is a basic, non-branching Pipeline. For example: This is a 3-stage Pipeline that queries some SQL data in stage 1, does some custom data transformation in stage 2, and then writes the resulting data to a SQL table in stage 3. The code to create and run this basic Pipeline would look something like: The second way to construct a Pipeline is using a PipelineLayout. This method allows for more complex Pipeline configurations that support branching between stages that are running multiple DataProcessors. Here is a (fairly complex) example: This Pipeline consists of 4 stages where each DataProcessor is choosing which DataProcessors in the subsequent stage should receive the data it sends. The SQLReader in stage 2, for example, is sending data to only 2 processors in the next stage, while the Custom DataProcessor in stage 2 is sending it's data to 3. The code for constructing and running a Pipeline like this would look like: This example is only conceptual, the main points being to explain the flexibility you have when designing your Pipeline's layout and to demonstrate the syntax for constructing a new PipelineLayout.

v0.0.0-20220712112451-ae77cf6993f7 • 2 years ago

github.com/caltechlibrary/springytools

expected.go is a set of testing functions Author: R. S. Doiel <rsdoiel@caltech.edu> Copyright (c) 2021, Caltech All rights not granted herein are expressly reserved by Caltech. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. 3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. extractors.go provides funcs for processing text and pulling out elements like URL links. Author: R. S. Doiel <rsdoiel@caltech.edu> Copyright (c) 2021, Caltech All rights not granted herein are expressly reserved by Caltech. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. 3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. libguides.go implements the data structures for working with with LibGuides exported XML. Author: R. S. Doiel <rsdoiel@caltech.edu> Copyright (c) 2021, Caltech All rights not granted herein are expressly reserved by Caltech. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. 3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. reports.go provides the functions that work in an input filename and output filename generating reports or data conversions. Author: R. S. Doiel <rsdoiel@caltech.edu> Copyright (c) 2021, Caltech All rights not granted herein are expressly reserved by Caltech. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. 3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. tables.go provides a XML, JSON and CSV rendering of the Table datastructure. Author: R. S. Doiel <rsdoiel@caltech.edu> Copyright (c) 2021, Caltech All rights not granted herein are expressly reserved by Caltech. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. 3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

v0.0.3 • 3 years ago

github.com/licaonfee/ratchet

Package ratchet is a library for performing data pipeline / ETL tasks in Go. The main construct in Ratchet is Pipeline. A Pipeline has a series of PipelineStages, which will each perform some type of data processing, and then send new data on to the next stage. Each PipelineStage consists of one or more DataProcessors, which are responsible for receiving, processing, and then sending data on to the next stage of processing. DataProcessors each run in their own goroutine, and therefore all data processing can be executing concurrently. Here is a conceptual drawing of a fairly simple Pipeline: In this example, we have a Pipeline consisting of 3 PipelineStages. The first stage has a DataProcessor that runs queries on a SQL database, the second is doing custom transformation work on that data, and the third stage branches into 2 DataProcessors, one writing the resulting data to a CSV file, and the other inserting into another SQL database. In the example above, Stage 1 and Stage 3 are using built-in DataProcessors (see the "processors" package/subdirectory). However, Stage 2 is using a custom implementation of DataProcessor. By using a combination of built-in processors, and supporting the writing of any Go code to process data, Ratchet makes it possible to write very custom and fast data pipeline systems. See the DataProcessor documentation to learn more. Since each DataProcessor is running in it's own goroutine, SQLReader can continue pulling and sending data while each subsequent stage is also processing data. Optimally-designed pipelines have processors that can each run in an isolated fashion, processing data without having to worry about what's coming next down the pipeline. All data payloads sent between DataProcessors are of type data.JSON ([]byte). This provides a good balance of consistency and flexibility. See the "data" package for details and helper functions for dealing with data.JSON. Another good read for handling JSON data in Go is http://blog.golang.org/json-and-go. Note that many of the concepts in Ratchet were taken from the Golang blog's post on pipelines (http://blog.golang.org/pipelines). While the details discussed in that blog post are largely abstracted away by Ratchet, it is still an interesting read and will help explain the general concepts being applied. There are two ways to construct and run a Pipeline. The first is a basic, non-branching Pipeline. For example: This is a 3-stage Pipeline that queries some SQL data in stage 1, does some custom data transformation in stage 2, and then writes the resulting data to a SQL table in stage 3. The code to create and run this basic Pipeline would look something like: The second way to construct a Pipeline is using a PipelineLayout. This method allows for more complex Pipeline configurations that support branching between stages that are running multiple DataProcessors. Here is a (fairly complex) example: This Pipeline consists of 4 stages where each DataProcessor is choosing which DataProcessors in the subsequent stage should receive the data it sends. The SQLReader in stage 2, for example, is sending data to only 2 processors in the next stage, while the Custom DataProcessor in stage 2 is sending it's data to 3. The code for constructing and running a Pipeline like this would look like: This example is only conceptual, the main points being to explain the flexibility you have when designing your Pipeline's layout and to demonstrate the syntax for constructing a new PipelineLayout.

v2.0.4+incompatible • 8 years ago

github.com/b-2019-apt-test/divider

github.com/ttys3/json2csv

github.com/tmaiaroto/ratchet

github.com/cromon/ratchet2

github.com/suddani/json_to_csv

github.com/murilobsd/cptec

github.com/Belphemur/vue-json-csv

github.com/bool64/dbdog

github.com/dkozniewski/csv_to_json

github.com/jimmykodes/csv-to-json

github.com/felipeaz/azuread-csv-to-json

github.com/tsaserv/ratchet

github.com/janapc/convert-csv-json

github.com/johnandrade18/convertir-csv-a-json

github.com/jyyan/freegeoip

github.com/jmurowaniecki/freegeoip

github.com/kjk7212/ratchet

github.com/kenjitheman/json-to-csv

github.com/selinjodhani/csv_to_json

github.com/infinitusai/json2csv

github.com/yaslama/freegeoip

github.com/mcswitch/ratchet

github.com/tesla-sca/ratchet

github.com/likearthian/ratchet

github.com/subzerodevelopment/ratchet

github.com/montanaflynn/freegeoip

github.com/synecdoche-se/ratchet3

github.com/caltechlibrary/springytools

github.com/licaonfee/ratchet

github.com/jgsheppa/csv-to-json

github.com/himanshu007-creator/csv-to-json-generator

github.com/easonlai/go_json_to_csv

github.com/giorgiabzianidzese/csv_to_json_converter-golang

github.com/myENA/consul-decoder

github.com/codytheroux96/csv-to-json

github.com/rockwotj/redpanda/src/transform-sdk/go/transform/v24

github.com/godogx/dbsteps

github.com/rockwotj/redpanda/src/go/sdk

github.com/richard-lyman/harbor

github.com/esonhugh/sculptor

github.com/afghanistanyn/json2csv

github.com/neilotoole/streamcache

github.com/dextor19/downtable

github.com/xnslong/cli-tools/csv_to_json

github.com/zspishere/json2csv