Research
Security News
Malicious npm Packages Inject SSH Backdoors via Typosquatted Libraries
Socket’s threat research team has detected six malicious npm packages typosquatting popular libraries to insert SSH backdoors.
github.com/googlecloudplatform/dataflowtemplates
These Dataflow templates are an effort to solve simple, but large, in-Cloud data tasks, including data import/export/backup/restore and bulk API operations, without a development environment. The technology under the hood which makes these operations possible is the Google Cloud Dataflow service combined with a set of Apache Beam SDK templated pipelines.
Google is providing this collection of pre-implemented Dataflow templates as a reference and to provide easy customization for developers wanting to extend their functionality.
As of November 18, 2021, our default branch is now named main
. This does not
affect forks. If you would like your fork and its local clone to reflect these
changes you can
follow GitHub's branch renaming guide.
For documentation on each template's usage and parameters, please see the official docs.
User-defined functions (UDFs) allow you to customize a template's functionality by providing a short JavaScript function without having to maintain the entire codebase. This is useful in situations which you'd like to rename fields, filter values, or even transform data formats before output to the destination. All UDFs are executed by providing the payload of the element as a string to the JavaScript function. You can then use JavaScript's in-built JSON parser or other system functions to transform the data prior to the pipeline's output. The return statement of a UDF specifies the payload to pass forward in the pipeline. This should always return a string value. If no value is returned or the function returns undefined, the incoming record will be filtered from the output.
Template | UDF Input Type | Input Description | UDF Output Type | Output Description |
---|---|---|---|---|
Datastore Bulk Delete | String | A JSON string of the entity | String | A JSON string of the entity to delete; filter entities by returning undefined |
Datastore to Pub/Sub | String | A JSON string of the entity | String | The payload to publish to Pub/Sub |
Datastore to GCS Text | String | A JSON string of the entity | String | A single-line within the output file |
GCS Text to BigQuery | String | A single-line within the input file | String | A JSON string which matches the destination table's schema |
Pub/Sub to BigQuery | String | A string representation of the incoming payload | String | A JSON string which matches the destination table's schema |
Pub/Sub to Datastore | String | A string representation of the incoming payload | String | A JSON string of the entity to write to Datastore |
Pub/Sub to Splunk | String | A string representation of the incoming payload | String | The event data to be sent to Splunk HEC events endpoint. Must be a string or a stringified JSON object |
For a comprehensive list of samples, please check our udf-samples folder.
/**
* A transform which adds a field to the incoming data.
* @param {string} inJson
* @return {string} outJson
*/
function transform(inJson) {
var obj = JSON.parse(inJson);
obj.dataFeed = "Real-time Transactions";
obj.dataSource = "POS";
return JSON.stringify(obj);
}
/**
* A transform function which only accepts 42 as the answer to life.
* @param {string} inJson
* @return {string} outJson
*/
function transform(inJson) {
var obj = JSON.parse(inJson);
// only output objects which have an answer to life of 42.
if (obj.hasOwnProperty('answerToLife') && obj.answerToLife === 42) {
return JSON.stringify(obj);
}
}
To contribute to the repository, see CONTRIBUTING.md.
Templates are released in a weekly basis (best-effort) as part of the efforts to keep Google-provided Templates updated with latest fixes and improvements.
To learn more about this process, or how you can stage your own changes, see Release Process.
FAQs
Unknown package
Did you know?
Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.
Research
Security News
Socket’s threat research team has detected six malicious npm packages typosquatting popular libraries to insert SSH backdoors.
Security News
MITRE's 2024 CWE Top 25 highlights critical software vulnerabilities like XSS, SQL Injection, and CSRF, reflecting shifts due to a refined ranking methodology.
Security News
In this segment of the Risky Business podcast, Feross Aboukhadijeh and Patrick Gray discuss the challenges of tracking malware discovered in open source softare.