ru.napalabs.spark:spark-hscan_2.11

spark-hscan

This project is a Hyperscan wrapper for Spark that allows matching large numbers (up to tens of thousands) of regular expressions.

Add it to your project

Spark shell

spark-shell --packages ru.napalabs.spark:spark-hscan_2.11:0.1

SBT

libraryDependencies += "ru.napalabs.spark" % "spark-hscan_2.11" % "0.1"
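
For a standalone project, a fuller build definition might look like the sketch below. The Scala version is an assumption derived from the _2.11 artifact suffix, and the Spark dependency with "provided" scope is an assumption about how the job is deployed; adjust both to your environment.

```scala
// build.sbt (a minimal sketch, not the project's official build)

// Assumption: any Scala 2.11.x release, matching the _2.11 artifact suffix
scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  // Assumption: Spark is supplied by the cluster at runtime, hence "provided"
  "org.apache.spark" %% "spark-sql" % "2.3.2" % "provided",
  // %% appends the Scala binary version, resolving to spark-hscan_2.11
  "ru.napalabs.spark" %% "spark-hscan" % "0.1"
)
```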

Usage examples

SparkSQL

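import ru.napalabs.spark.hscan.implicits._

// Register the Hyperscan functions (including hlike) for use in SQL
spark.registerHyperscanFuncs()

// Filter my_table with hlike, passing the text column and an array of patterns
val df = spark.sql("""
select * from my_table
    where hlike(text_field, array("pattern.*", "[a-zA-Z]+other"))
""")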

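Scala DSL

import spark.implicits._ // provides the $"..." column syntax
import ru.napalabs.spark.hscan.functions._

val df = spark.read
         .format("parquet")
         .load("/path/to/files")
// Pass both the column and the pattern array to hlike
df.where(hlike($"text_col", Array("pattern.*", "[a-zA-Z]+other")))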
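
Pattern typos are easy to miss when the list runs to thousands of entries, so it can help to sanity-check patterns locally before shipping them to a Spark job. The sketch below uses only java.util.regex and is not part of spark-hscan; PatternCheck, invalidPatterns and matchesAny are hypothetical names. It assumes that hlike is true when at least one pattern matches somewhere in the text, and note that Java regex syntax only approximates what Hyperscan accepts, so passing this check does not guarantee that Hyperscan will compile a pattern.

```scala
import java.util.regex.{Pattern, PatternSyntaxException}

// Hypothetical helper, not part of spark-hscan
object PatternCheck {
  // Returns each pattern that java.util.regex refuses to compile,
  // paired with the compiler's error description.
  def invalidPatterns(patterns: Seq[String]): Seq[(String, String)] =
    patterns.flatMap { p =>
      try {
        Pattern.compile(p)
        None
      } catch {
        case e: PatternSyntaxException => Some(p -> e.getDescription)
      }
    }

  // Assumed semantics: true if at least one pattern matches
  // anywhere in the text (unanchored search).
  def matchesAny(text: String, patterns: Seq[String]): Boolean =
    patterns.exists(p => Pattern.compile(p).matcher(text).find())
}
```

For example, PatternCheck.invalidPatterns(Seq("[a-Z]+other")) reports the pattern as invalid, because the range a-Z runs backwards in character order.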

Benchmark

As a benchmark, we used hsbench with the teakettle_2500 pattern set and the alexa200.db dataset. See the benchmark results.

Limitations

See the limitations of the hyperscan-java project.

Also, this project has only been tested with Spark 2.3.2. Compatibility with other versions of Spark is not guaranteed.
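
Since Spark 2.3.2 is the only version named here, a job can fail fast when it runs on anything else. The following is a minimal sketch; SparkVersionGuard and its methods are hypothetical names, not part of spark-hscan, and whether other 2.3.x releases actually work is an assumption to verify yourself.

```scala
// Hypothetical guard, not part of spark-hscan
object SparkVersionGuard {
  // The only Spark version named in this README
  val TestedVersion = "2.3.2"

  // Strict check: exactly the named version
  def isTestedVersion(version: String): Boolean =
    version == TestedVersion

  // Looser check: any 2.3.x release (assumption: patch releases are compatible)
  def sameMinorLine(version: String): Boolean =
    version.startsWith("2.3.")
}
```

In a job, this could be wired in with require(SparkVersionGuard.isTestedVersion(spark.version), "untested Spark version") before registering the Hyperscan functions.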

Contributing

Feel free to raise issues or submit a pull request.

License

Apache License, Version 2.0

Package last updated on 19 Oct 2020