github.com/steakknife/bloomfilter

  • v0.0.0-20180922174646-6819c0d2a570
  • Go

Important: Zeroth, consider if a Cuckoo filter could be right for your use-case.

Face-meltingly fast, thread-safe, marshalable, unionable, probability- and optimal-size-calculating Bloom filter in Go

Copyright © 2014-2016,2018 Barry Allard

MIT license

WTF is a bloom filter

**TL;DR:** A probabilistic, extra lookup table that tracks a set of elements kept elsewhere. It reduces expensive, unnecessary element retrieval and/or iterator operations when an element is not present in the set. It's a classic time-storage tradeoff algorithm.
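
To make the idea concrete, here is a toy sketch of a Bloom filter used as a cheap pre-check before an expensive lookup. It is not this package's implementation: `toyFilter`, `newToyFilter`, and the FNV-based double hashing are all assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// toyFilter is a hypothetical, minimal Bloom filter for illustration only.
type toyFilter struct {
	bits []uint64 // m bits packed into 64-bit words
	m    uint64   // number of bits
	k    uint64   // number of probes per element
}

func newToyFilter(m, k uint64) *toyFilter {
	return &toyFilter{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

// probes derives k bit positions from two FNV hashes (double hashing).
// This hashing scheme is an assumption, not the library's.
func (f *toyFilter) probes(s string) []uint64 {
	h1 := fnv.New64a()
	h1.Write([]byte(s))
	h2 := fnv.New64()
	h2.Write([]byte(s))
	a, b := h1.Sum64(), h2.Sum64()|1 // force b odd so the step is never zero
	out := make([]uint64, f.k)
	for i := uint64(0); i < f.k; i++ {
		out[i] = (a + i*b) % f.m
	}
	return out
}

// Add sets every probed bit for s.
func (f *toyFilter) Add(s string) {
	for _, p := range f.probes(s) {
		f.bits[p/64] |= 1 << (p % 64)
	}
}

// Contains reports false only if s was definitely never added.
func (f *toyFilter) Contains(s string) bool {
	for _, p := range f.probes(s) {
		if f.bits[p/64]&(1<<(p%64)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	f := newToyFilter(1024, 4)
	f.Add("alice")
	fmt.Println(f.Contains("alice")) // true: added elements are always found
	if !f.Contains("bob") {
		fmt.Println("bob is definitely absent; skip the expensive lookup")
	}
}
```

A `false` answer is definitive, so the caller can skip the real lookup; a `true` answer only means "go check the real set".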

Properties

See wikipedia for algorithm details
| Impact | What | Description |
| --- | --- | --- |
| Good | No false negatives | know for certain if a given element is definitely NOT in the set |
| Bad | False positives | uncertain if a given element is in the set |
| Bad | Theoretical potential for hash collisions | in very large systems and/or badly hash.Hash64-conforming implementations |
| Bad | Add only | Cannot remove an element, it would destroy information about other elements |
| Good | Constant storage | uses only a fixed amount of memory |
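
The false-positive row can be quantified with the textbook approximation p ≈ (1 − e^(−k·n/m))^k, where m, n, and k are the bit count, element count, and probe count. The sketch below evaluates that formula; `falsePositiveRate` is a hypothetical helper and is not claimed to match this package's own probability calculation.

```go
package main

import (
	"fmt"
	"math"
)

// falsePositiveRate evaluates the standard approximation
// p ≈ (1 - e^(-k*n/m))^k for a filter of m bits holding n elements
// with k probes per element. Hypothetical helper, not the library's API.
func falsePositiveRate(m, n, k uint64) float64 {
	fill := 1 - math.Exp(-float64(k)*float64(n)/float64(m))
	return math.Pow(fill, float64(k))
}

func main() {
	const m, k = 1 << 20, 7
	// The rate climbs as more elements are packed into the same m bits.
	for _, n := range []uint64{1000, 10000, 100000} {
		fmt.Printf("n=%d p≈%.3g\n", n, falsePositiveRate(m, n, k))
	}
}
```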

Naming conventions

(Similar to algorithm)

| Variable/function | Description | Range |
| --- | --- | --- |
| m/M() | number of bits in the bloom filter (memory representation is about m/8 bytes in size) | >=2 |
| n/N() | number of elements present | >=0 |
| k/K() | number of keys to use (keys are kept private to user code but are de/serialized to Marshal and file I/O) | >=0 |
| maxN | maximum capacity of intended structure | >0 |
| p | maximum allowed probability of collision (for computing m and k for optimal sizing) | >0..<1 |
  • Memory representation should be exactly 24 + 8*(k + (m+63)/64) + unsafe.Sizeof(RWMutex) bytes.
  • Serialized (BinaryMarshaler) representation should be exactly 72 + 8*(k + (m+63)/64) bytes. (Disk format is less due to compression.)
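
As a rough sketch of how these quantities relate, the code below derives m and k from maxN and p using the textbook sizing formulas, then applies the two size formulas above. The function names are hypothetical, and the rounding (ceiling) is an assumption; the package's own optimal-size calculation may round differently.

```go
package main

import (
	"fmt"
	"math"
	"sync"
	"unsafe"
)

// optimalM applies the textbook formula m = ceil(-maxN * ln(p) / (ln 2)^2).
// Hypothetical helper; rounding is an assumption.
func optimalM(maxN uint64, p float64) uint64 {
	return uint64(math.Ceil(-float64(maxN) * math.Log(p) / (math.Ln2 * math.Ln2)))
}

// optimalK applies the textbook formula k = ceil(ln 2 * m / maxN).
func optimalK(m, maxN uint64) uint64 {
	return uint64(math.Ceil(math.Ln2 * float64(m) / float64(maxN)))
}

// serializedSize is the BinaryMarshaler size from the README:
// 72 + 8*(k + (m+63)/64) bytes.
func serializedSize(m, k uint64) uint64 {
	return 72 + 8*(k+(m+63)/64)
}

// memorySize is the in-memory size from the README:
// 24 + 8*(k + (m+63)/64) + unsafe.Sizeof(RWMutex) bytes.
func memorySize(m, k uint64) uint64 {
	return 24 + 8*(k+(m+63)/64) + uint64(unsafe.Sizeof(sync.RWMutex{}))
}

func main() {
	const maxN, p = 100000, 0.0000001
	m := optimalM(maxN, p)
	k := optimalK(m, maxN)
	fmt.Println("m =", m, "k =", k)
	fmt.Println("serialized bytes =", serializedSize(m, k))
	fmt.Println("in-memory bytes  =", memorySize(m, k))
}
```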

Binary serialization format

All values are in little-endian format.

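| Offset | Offset (Hex) | Length (bytes) | Name | Type |
| --- | --- | --- | --- | --- |
| 0 | 00 | 8 | k | uint64 |
| 8 | 08 | 8 | n | uint64 |
| 16 | 10 | 8 | m | uint64 |
| 24 | 18 | 8*k | (keys) | [k]uint64 |
| 24+8*k | ... | 8*((m+63)/64) | (bloom filter) | [(m+63)/64]uint64 |
| 24+8*k+8*((m+63)/64) | ... | 48 | (SHA384 of all previous fields, hashed in order) | [48]byte |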
  • bloomfilter.Filter conforms to encoding.BinaryMarshaler and encoding.BinaryUnmarshaler
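
The layout above can be exercised with the standard library alone. The following is a sketch under stated assumptions: `encode`, `decode`, and the `decoded` struct are hypothetical names written for this example, not the package's marshaling code, and the byte layout is taken only from the table.

```go
package main

import (
	"bytes"
	"crypto/sha512"
	"encoding/binary"
	"errors"
	"fmt"
)

// decoded holds the fields of the format table. Hypothetical type.
type decoded struct {
	k, n, m uint64
	keys    []uint64
	bits    []uint64
}

// encode writes k, n, m, keys, bits (little-endian) and appends SHA-384
// of everything before it. The caller must pass (m+63)/64 words in bits.
func encode(n, m uint64, keys, bits []uint64) []byte {
	var buf bytes.Buffer // writes to a bytes.Buffer cannot fail
	binary.Write(&buf, binary.LittleEndian, []uint64{uint64(len(keys)), n, m})
	binary.Write(&buf, binary.LittleEndian, keys)
	binary.Write(&buf, binary.LittleEndian, bits)
	sum := sha512.Sum384(buf.Bytes())
	buf.Write(sum[:])
	return buf.Bytes()
}

// decode checks the total length and trailing checksum, then reads fields.
func decode(b []byte) (decoded, error) {
	var d decoded
	if len(b) < 24+sha512.Size384 {
		return d, errors.New("too short")
	}
	d.k = binary.LittleEndian.Uint64(b[0:8])
	d.n = binary.LittleEndian.Uint64(b[8:16])
	d.m = binary.LittleEndian.Uint64(b[16:24])
	words := (d.m + 63) / 64
	limit := uint64(len(b)) / 8 // guards the size arithmetic below
	if d.k > limit || words > limit || uint64(len(b)) != 72+8*(d.k+words) {
		return d, errors.New("length does not match k and m")
	}
	body := b[:len(b)-sha512.Size384]
	sum := sha512.Sum384(body)
	if !bytes.Equal(sum[:], b[len(body):]) {
		return d, errors.New("checksum mismatch")
	}
	d.keys = make([]uint64, d.k)
	d.bits = make([]uint64, words)
	off := 24
	for i := range d.keys {
		d.keys[i] = binary.LittleEndian.Uint64(b[off:])
		off += 8
	}
	for i := range d.bits {
		d.bits[i] = binary.LittleEndian.Uint64(b[off:])
		off += 8
	}
	return d, nil
}

func main() {
	b := encode(3, 128, []uint64{11, 22}, []uint64{0xff, 0x01})
	d, err := decode(b)
	fmt.Println(len(b), d.k, d.n, d.m, err)
}
```

Note that the total length, 72 + 8*(k + (m+63)/64), matches the serialized-size formula given under the naming conventions.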

Usage


import "github.com/steakknife/bloomfilter"

const (
  maxElements = 100000
  probCollide = 0.0000001
)

bf, err := bloomfilter.NewOptimal(maxElements, probCollide)
if err != nil {
  panic(err)
}

someValue := ... // must conform to hash.Hash64

bf.Add(someValue)
if bf.Contains(someValue) { // always true after Add: there are no false negatives
  // whatever
}

anotherValue := ... // must also conform to hash.Hash64

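if bf.Contains(anotherValue) {
  // anotherValue was never added, so this is a false positive:
  // rare, but possible with a probability of roughly probCollide
  panic("false positive")
}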

err = bf.WriteFile("1.bf.gz")  // saves this BF to a file (err is already declared above)
if err != nil {
  panic(err)
}

bf2, err := bloomfilter.ReadFile("1.bf.gz") // read the BF to another var
if err != nil {
  panic(err)
}
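
The usage above leaves `someValue` elided and only requires that it conform to hash.Hash64. One way to obtain such a value with the standard library is sketched below. `hashOf` is a hypothetical helper, and FNV-1a is an assumption: any hash.Hash64 implementation satisfies the interface, and this README does not say which hash suits the filter best.

```go
package main

import (
	"fmt"
	"hash"
	"hash/fnv"
)

// hashOf returns a hash.Hash64 that has already consumed s.
// Hypothetical helper; FNV-1a is just one stdlib Hash64 implementation.
func hashOf(s string) hash.Hash64 {
	h := fnv.New64a()
	h.Write([]byte(s)) // hash.Hash's Write never returns an error
	return h
}

func main() {
	var v hash.Hash64 = hashOf("alice")
	fmt.Printf("%#x\n", v.Sum64())
}
```

A value built this way could then be handed to `bf.Add` and `bf.Contains` in place of the elided `someValue`.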

Design

Where possible, branch-free operations are used to avoid deep pipeline / execution unit stalls on branch-misses.
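
To illustrate what "branch-free" can mean for a membership test, the sketch below compares an early-return loop with one that ANDs the probed bits together and decides once at the end. Both functions are hypothetical and are not taken from this package's source; whether the compiled code is truly free of branches (bounds checks, loop control) depends on the compiler.

```go
package main

import "fmt"

// containsBranchy returns as soon as one probed bit is unset, which adds
// a data-dependent branch per probe.
func containsBranchy(bits []uint64, probes []uint64) bool {
	for _, p := range probes {
		if bits[p>>6]&(1<<(p&63)) == 0 {
			return false
		}
	}
	return true
}

// containsBranchFree accumulates the probed bits with AND. Bit 0 of r
// stays 1 only if every probed bit was set; the loop body has no
// data-dependent branch.
func containsBranchFree(bits []uint64, probes []uint64) bool {
	r := uint64(1)
	for _, p := range probes {
		r &= bits[p>>6] >> (p & 63)
	}
	return r == 1
}

func main() {
	bits := []uint64{0b1011, 0} // bits 0, 1 and 3 are set
	fmt.Println(containsBranchFree(bits, []uint64{0, 1, 3})) // true
	fmt.Println(containsBranchFree(bits, []uint64{0, 2}))    // false
}
```

The trade-off is that the branch-free form always touches all k probes, even when the first one would already have answered "absent".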

Get

go get -u github.com/steakknife/bloomfilter  # master is always stable

License

MIT license

Copyright © 2014-2016 Barry Allard

Package last updated on 22 Sep 2018
