Security News
Research
Data Theft Repackaged: A Case Study in Malicious Wrapper Packages on npm
The Socket Research Team breaks down a malicious wrapper package that uses obfuscation to harvest credentials and exfiltrate sensitive data.
Fury is a blazing fast multi-language serialization framework powered by jit, vectorization and zero-copy
Fury is a blazing-fast multi-language serialization framework powered by jit(just-in-time compilation) and zero-copy, providing up to 170x performance and ultimate ease of use.
In addition to cross-language serialization, Fury also features at:
writeObject/readObject/writeReplace/readResolve/readObjectNoData/Externalizable
API.record
is supported too.Different scenarios have different serialization requirements. Fury designed and implemented multiple binary protocols for those requirements:
New protocols can be easily added based on fury existing buffer, encoding, meta, codegen and other capabilities. All of those share the same codebase, and the optimization for one protocol can be reused by another protocol.
Different serialization frameworks are suitable for different scenarios, and benchmark results here are for reference only.
If you need to benchmark for your specific scenario, make sure all serialization frameworks are appropriately configured for that scenario.
Dynamic serialization frameworks support polymorphism and references, but they often come with a higher cost compared to static serialization frameworks, unless they utilize JIT techniques like Fury does. Because Fury generates code at runtime, it is recommended to warm up the system before collecting benchmark statistics.
Title containing "compatible" represent schema compatible mode: support type forward/backward compatibility.
Title without "compatible" represent schema consistent mode: class schema must be the same between serialization and deserialization.
Struct
is a class with 100 primitive fields, MediaContent
is a class from jvm-serializers, Sample
is a class from kryo benchmark.
See benchmarks for more benchmarks about type forward/backward compatibility, off-heap support, zero-copy serialization.
Nightly snapshot:
<repositories>
<repository>
<id>sonatype</id>
<url>https://s01.oss.sonatype.org/content/repositories/snapshots</url>
<releases>
<enabled>false</enabled>
</releases>
<snapshots>
<enabled>true</enabled>
</snapshots>
</repository>
</repositories>
<dependency>
<groupId>org.furyio</groupId>
<artifactId>fury-core</artifactId>
<version>0.5.0-SNAPSHOT</version>
</dependency>
<!-- row/arrow format support -->
<!-- <dependency>
<groupId>org.furyio</groupId>
<artifactId>fury-format</artifactId>
<version>0.5.0-SNAPSHOT</version>
</dependency> -->
Release version:
<dependency>
<groupId>org.furyio</groupId>
<artifactId>fury-core</artifactId>
<version>0.4.1</version>
</dependency>
<!-- row/arrow format support -->
<!-- <dependency>
<groupId>org.furyio</groupId>
<artifactId>fury-format</artifactId>
<version>0.4.1</version>
</dependency> -->
libraryDependencies += "org.furyio" % "fury-core" % "0.4.1"
# Release version will be provided in the future.
pip install pyfury --pre
npm install @furyjs/fury
go get github.com/alipay/fury/go/fury
Here we give a quick start about how to use fury, see user guide for more details about java, cross language, and row format.
If you don't have cross-language requirements, using this mode will have better performance.
import io.fury.*;
import io.fury.config.*;
import java.util.*;
public class Example {
public static void main(String[] args) {
SomeClass object = new SomeClass();
// Note that Fury instances should be reused between
// multiple serializations of different objects.
{
Fury fury = Fury.builder().withLanguage(Language.JAVA)
// Allow to deserialize objects unknown types, more flexible
// but may be insecure if the classes contains malicious code.
.requireClassRegistration(false)
.build();
// Registering types can reduce class name serialization overhead, but not mandatory.
// If class registration enabled, all custom types must be registered.
fury.register(SomeClass.class);
byte[] bytes = fury.serialize(object);
System.out.println(fury.deserialize(bytes));
}
{
ThreadSafeFury fury = Fury.builder().withLanguage(Language.JAVA)
// Allow to deserialize objects unknown types, more flexible
// but may be insecure if the classes contains malicious code.
.requireClassRegistration(false)
.buildThreadSafeFury();
byte[] bytes = fury.serialize(object);
System.out.println(fury.deserialize(bytes));
}
{
ThreadSafeFury fury = new ThreadLocalFury(classLoader -> {
Fury f = Fury.builder().withLanguage(Language.JAVA)
.withClassLoader(classLoader).build();
f.register(SomeClass.class);
return f;
});
byte[] bytes = fury.serialize(object);
System.out.println(fury.deserialize(bytes));
}
}
}
Java
import io.fury.*;
import io.fury.config.*;
import java.util.*;
public class ReferenceExample {
public static class SomeClass {
SomeClass f1;
Map<String, String> f2;
Map<String, String> f3;
}
public static Object createObject() {
SomeClass obj = new SomeClass();
obj.f1 = obj;
obj.f2 = ofHashMap("k1", "v1", "k2", "v2");
obj.f3 = obj.f2;
return obj;
}
// mvn exec:java -Dexec.mainClass="io.fury.examples.ReferenceExample"
public static void main(String[] args) {
Fury fury = Fury.builder().withLanguage(Language.XLANG)
.withRefTracking(true).build();
fury.register(SomeClass.class, "example.SomeClass");
byte[] bytes = fury.serialize(createObject());
// bytes can be data serialized by other languages.
System.out.println(fury.deserialize(bytes));
}
}
Python
from typing import Dict
import pyfury
class SomeClass:
f1: "SomeClass"
f2: Dict[str, str]
f3: Dict[str, str]
fury = pyfury.Fury(ref_tracking=True)
fury.register_class(SomeClass, type_tag="example.SomeClass")
obj = SomeClass()
obj.f2 = {"k1": "v1", "k2": "v2"}
obj.f1, obj.f3 = obj, obj.f2
data = fury.serialize(obj)
# bytes can be data serialized by other languages.
print(fury.deserialize(data))
Golang
package main
import furygo "github.com/alipay/fury/go/fury"
import "fmt"
func main() {
type SomeClass struct {
F1 *SomeClass
F2 map[string]string
F3 map[string]string
}
fury := furygo.NewFury(true)
if err := fury.RegisterTagType("example.SomeClass", SomeClass{}); err != nil {
panic(err)
}
value := &SomeClass{F2: map[string]string{"k1": "v1", "k2": "v2"}}
value.F3 = value.F2
value.F1 = value
bytes, err := fury.Marshal(value)
if err != nil {
}
var newValue interface{}
// bytes can be data serialized by other languages.
if err := fury.Unmarshal(bytes, &newValue); err != nil {
panic(err)
}
fmt.Println(newValue)
}
public class Bar {
String f1;
List<Long> f2;
}
public class Foo {
int f1;
List<Integer> f2;
Map<String, Integer> f3;
List<Bar> f4;
}
RowEncoder<Foo> encoder = Encoders.bean(Foo.class);
Foo foo = new Foo();
foo.f1 = 10;
foo.f2 = IntStream.range(0, 1000000).boxed().collect(Collectors.toList());
foo.f3 = IntStream.range(0, 1000000).boxed().collect(Collectors.toMap(i -> "k"+i, i->i));
List<Bar> bars = new ArrayList<>(1000000);
for (int i = 0; i < 1000000; i++) {
Bar bar = new Bar();
bar.f1 = "s"+i;
bar.f2 = LongStream.range(0, 10).boxed().collect(Collectors.toList());
bars.add(bar);
}
foo.f4 = bars;
// Can be zero-copy read by python
BinaryRow binaryRow = encoder.toRow(foo);
// can be data from python
Foo newFoo = encoder.fromRow(binaryRow);
// zero-copy read List<Integer> f2
BinaryArray binaryArray2 = binaryRow.getArray(1);
// zero-copy read List<Bar> f4
BinaryArray binaryArray4 = binaryRow.getArray(3);
// zero-copy read 11th element of `readList<Bar> f4`
BinaryRow barStruct = binaryArray4.getStruct(10);
// zero-copy read 6th of f2 of 11th element of `readList<Bar> f4`
barStruct.getArray(1).getLong(5);
RowEncoder<Bar> barEncoder = Encoders.bean(Bar.class);
// deserialize part of data.
Bar newBar = barEncoder.fromRow(barStruct);
Bar newBar2 = barEncoder.fromRow(binaryArray4.getStruct(20));
@dataclass
class Bar:
f1: str
f2: List[pa.int64]
@dataclass
class Foo:
f1: pa.int32
f2: List[pa.int32]
f3: Dict[str, pa.int32]
f4: List[Bar]
encoder = pyfury.encoder(Foo)
foo = Foo(f1=10, f2=list(range(1000_000)),
f3={f"k{i}": i for i in range(1000_000)},
f4=[Bar(f1=f"s{i}", f2=list(range(10))) for i in range(1000_000)])
binary: bytes = encoder.to_row(foo).to_bytes()
foo_row = pyfury.RowData(encoder.schema, binary)
print(foo_row.f2[100000], foo_row.f4[100000].f1, foo_row.f4[200000].f2[5])
Fury java object graph serialization support class schema forward/backward compatibility. The serialization peer and deserialization peer can add/delete fields independently.
We plan to add support cross-language serialization after meta compression is finished.
We are still improving our protocols, binary compatibility is not ensured between fury major releases for now.
it's ensured between minor versions only. Please
versioning
your data by fury major version if you will upgrade fury in the future, see how to upgrade fury for further details.
Binary compatibility will be ensured when fury 1.0 is released.
Static serialization is secure. But dynamic serialization such as fury java/python native serialization supports deserializing unregistered types, which provides more dynamics and flexibility, but also introduce security risks.
For example, the deserialization may invoke init
constructor or equals
/hashCode
method, if the method body contains malicious code, the system will be at risk.
Fury provides a class registration option that is enabled by default for such protocols, allowing only deserialization of trusted registered types or built-in types. Do not disable class registration unless you can ensure your environment is secure.
If this option is disabled, you are responsible for serialization security. You can configure io.fury.resolver.ClassChecker
by
ClassResolver#setClassChecker
to control which classes are allowed for serialization.
Please read the BUILD guide for instructions on how to build.
Please read the CONTRIBUTING guide for instructions on how to contribute.
For ecosystem projects, please see https://github.com/fury-project
Licensed under the Apache License, Version 2.0
FAQs
Fury is a blazing fast multi-language serialization framework powered by jit, vectorization and zero-copy
We found that pyfury demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.
Did you know?
Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.
Security News
Research
The Socket Research Team breaks down a malicious wrapper package that uses obfuscation to harvest credentials and exfiltrate sensitive data.
Research
Security News
Attackers used a malicious npm package typosquatting a popular ESLint plugin to steal sensitive data, execute commands, and exploit developer systems.
Security News
The Ultralytics' PyPI Package was compromised four times in one weekend through GitHub Actions cache poisoning and failure to rotate previously compromised API tokens.