Research
Security News
Malicious npm Packages Inject SSH Backdoors via Typosquatted Libraries
Socket’s threat research team has detected six malicious npm packages typosquatting popular libraries to insert SSH backdoors.
org.polystat.odin:parser_3
Advanced tools
Odin (object dependency inspector) — static analyzer for EO source code that detects OOP-related bugs.
Odin (object dependency inspector) — a static analyzer for EO programming language.
Odin is used in polystat and polystat-cli.
(in-depth documentation for each defect can be found in the subsequent sections)
As of now, ODIN supports the following defect types:
Mutual recursion caused by method redefinition during inheritance.
Sample input:
[] > a
b > @
[self] > f
self > @
[] > b
[] > d
a > @
[self] > g
self > @
[] > c
[] > e
a > @
[self] > h
self.f self > @
[] > t
c.e > @
[self] > f
self.h self > @
Sample output:
t:
t.h (was last redefined in "c.e") ->
t.f ->
t.h (was last redefined in "c.e")
Assumptions made in subclasses regarding method dependencies in superclasses.
Sample input:
[] > base
[self x] > f
seq > @
assert (x.less 9)
x.add 1
[self x] > g
seq > @
assert ((self.f self (x.add 1)).less 10)
22
[] > derived
base > @
[self x] > f
seq > @
assert (5.less x)
x.sub 1
Sample output:
Method g is not referentially transparent
Altering the state stored in the base class in undesirable ways.
Sample input:
[] > a
memory > state
[self new_state] > update_state
self.state.write new_state > @
[] > b
a > @
[self new_state] > change_state_plus_two
self.state.write (new_state.add 2) > @
Sample output:
Method 'change_state_plus_two' of object 'b' directly accesses state 'state' of base class 'a'
Ability to use subclasses in place of superclasses.
Sample input:
[] > base
[self x] > f
seq > @
assert (x.less 9)
x.add 1
[] > derived
base > @
[self x] > f
seq > @
assert (x.greater 9)
x.sub 1
Sample output:
Method f of object derived violates the Liskov substitution principle as compared to version in parent object base
This analyzer is capable of detecting mutual recursion caused by method redefinition during inheritance.
Only the following patterns are recognized by the algorithm:
[] > name
{0 or 1 decoratee}
{>= 0 method declarations}
{>= 0 object declarations}
simple_name > @
or
an.attribute.of.some.object > @
[self {>= 0 params}] > methodName
{>= 0 method calls} > @
self.methodName self {>= 0 params}
The algorithm supports only two kinds of decorations: simple applications, like fruit > @
, and attribute decorations,
like fruits.citrus.orange > @
. All the other patterns are ignored, thus an object containing such pattern is
considered to not have a @
attribute at all!
For example, in this program:
[] > baobab
1.add 2 > @
[self] > stall
the statement 1.add 2 > @
is ignored so, the algorithm 'sees' this program like so:
[] > baobab
[self] > stall
The algorithm takes into consideration every possible method call that is present in the body, whether it is actually
called does not matter. For example, in method zen
:
[self] > zen
if. > @
false
self.no self
self.yes self
The algorithm will consider both no
and yes
method calls, even though no
is never called.
The same goes for unused local definitions. In EO, "method" is an object containing a special @
(phi) attribute. When
the method is called:
self.method self
it is the object labelled by @
that gets executed. Any other local bindings that are not used in the binding
associated with @
are ignored. For example:
[self] > m
fib 100000 > huge_computation!
1.add 1 > @
Only 1.add 1
will be computed when method m
is called. The huge_computatuion
will never be performed (since it is
never called in the declaration of @
).
That being said, if huge_computation
is a method call, say:
[self] > m
# a valid method call now!
self.fib self 100000 > huge_computation!
1.add 1 > @
The algorithm would still consider it as "called". In other words, it will be a part of the analyzer output if it is a part of some mutual recursion chain.
Furthermore, the algorithm does not run any checks for the presence of @
attribute. As a result, a method that would
produce a runtime error when called, like:
[self] > m
self.fib self 100000 > huge_computation!
# this is no longer `@`
1.add 1 > two
is still considered by the algorithm.
Even though the following code is perfectly valid in EO:
[] > a
b > @
[] > b
a > @
This code (and the similar variations) will cause the analyzer to fail with StackOverflowError
.
Parse the source code to build its AST.
Create a tree of partial objects that preserves the object nestedness. Basically, collect all the information that can be collect within a single pass.
This tree is then refined to create a complete representation of the program, with method redefinition.
Traverse the callgraphs finding cycles that involve methods coming from multiple objects. For example, the callchain a.f -> a.g -> a.f will not be in the output, meanwhile the callchain a.f -> b.g -> a.f will be a part of it.
The analyzer can fail with the following errors:
[] > a
b > @
Exception in thread "main" java.lang.Exception: Parent (or decoratee) object "b" of object "a" is specified, but not defined in the program!
.
[] > a
[self] > f
self.non_existent self > @
Exception in thread "main" java.lang.Exception: Method "non_existent" was called from the object "a", although it is not defined there!
.
For this type of defect we detect whether the refactoring of superclasses by inlining can affect the functionality of subclasses in an undesired way. The notion of unjustified assumption in this context refers to assumptions made in subclasses regarding method dependencies in superclasses.
The following stages and steps take place during program analysis:
self.method self ...
. Meaning that self
is used as both the source for the method and
is passed as the first argument.Before:
[] > obj
[self] > method
self > @
[method] > shadowedMethod
method > @
[] > notShadowedMethod
method > @
[] > notShadowedObj
obj > @
[] > shadowedObj
[obj] > method
[] > innerMethod
obj > @
obj > @
[] > outer
[] > self
256 > magic
[] > dummy
[outer] > dummyMethod
outer > @
outer.self > @
self "yahoo" > @
[self] > method
self.magic > @
After:
[] > obj
[self] > method
$.self > @
[method] > shadowedMethod
$.method > @
[] > notShadowedMethod
^.method > @
[] > notShadowedObj
^.obj > @
[] > shadowedObj
[obj] > method
[] > innerMethod
^.obj > @
$.obj > @
[] > outer
[] > self
256 > magic
[] > dummy
[outer] > dummyMethod
$.outer > @
^.^.^.outer.self > @
^.self > @
"yahoo"
[self] > method
$.self.magic > @
@
attribute, extract them into the local-attrs
object and put this object adjacent to the
call-site. 2.1.3. References to local attributes are replaced with their counterparts from the local-attrs
object.
2.1.4. The call is replaced with the value of the called method's @
attribute.Before:
[] > obj
[self arg1 arg2 arg3] > average3
arg1.add (arg2.add arg3) > sum
3 > count
sum.div count > average
[] > @
^.sum > sum
^.count > count
^.average > average
[self] > call-site
self.average3 self 1 2 3 > @
After:
[] > obj
[self arg1 arg2 arg3] > average3
$.arg1.add > sum
$.arg2.add
$.arg3
3 > count
$.sum.div > average
$.count
[] > @
^.sum > sum
^.count > count
^.average > average
[self] > call-site
[] > local_average3
1.add > sum
2.add
3
3 > count
$.sum.div > average
$.count
[] > @
^.local_average3.sum > sum
^.local_average3.count > count
^.local_average3.average > average
assert(logic-before-inlining => logic-after-inlining)
.(check-sat)
and (get-model)
commands and sent the SMT-solver
for further processing.Program:
[] > test
[] > parent
[self x] > f
x.sub 5 > y1
seq > @
assert (0.less y1)
x
[self y] > g
self.f self y > @
[self z] > h
z > @
[] > child
parent > @
[self y] > f
y > @
[self z] > h
self.g self z > @
Resulting SMT-code:
(define-fun value-of-before-f ((var-before-y Int)) Int var-before-y)
(define-fun properties-of-before-f ((var-before-y Int)) Bool (and true true))
(define-fun value-of-before-g ((var-before-y Int)) Int (value-of-before-f var-before-y))
(define-fun properties-of-before-g ((var-before-y Int)) Bool (and (properties-of-before-f var-before-y) true))
(define-fun value-of-before-h ((var-before-z Int)) Int (value-of-before-g var-before-z))
(define-fun properties-of-before-h ((var-before-z Int)) Bool (and (properties-of-before-g var-before-z) true))
(define-fun value-of-after-f ((var-after-y Int)) Int var-after-y)
(define-fun properties-of-after-f ((var-after-y Int)) Bool (and true true))
(define-fun value-of-after-g ((var-after-y Int)) Int var-after-y)
(define-fun properties-of-after-g ((var-after-y Int)) Bool (exists ((var-after-local_f Int)(var-after-local_f-y1 Int)) (and (and (and (and true true) (< 0 var-after-local_f-y1)) true) (and (and (and true true) (= var-after-local_f-y1 (- var-after-y 5))) true))))
(define-fun value-of-after-h ((var-after-z Int)) Int (value-of-after-f var-after-z))
(define-fun properties-of-after-h ((var-after-z Int)) Bool (and (properties-of-after-f var-after-z) true))
(assert (forall ((var-before-y Int)) (exists ((var-after-y Int)) (and (and true (= var-before-y var-after-y)) (=> (and (properties-of-before-f var-before-y) true) (exists ((var-after-local_f Int)(var-after-local_f-y1 Int)) (and (and (and (and true true) (< 0 var-after-local_f-y1)) true) (and (and (and true true) (= var-after-local_f-y1 (- var-after-y 5))) true))))))))
(check-sat)
(get-model)
assert (3.add 5)
.The error messages that the analyser can fail with at each stage are listed here.
[] > a
[self] > method
b > @
Exception in thread "main" java.lang.Exception: Could not set locator for non-existent object with name b
1 < non-existent
[] > a
non-existent > @
java.lang.Exception: There is no such parent with name "^.non-existent"
[] > b
[] > a
^.^.b > @
java.lang.Exception: Locator overshoot at ^.^.b
[] > a
[self] > f
self.non-existent self > @
java.lang.Exception: Inliner: attempt to call non-existent method "non-existent"
[] > a
[self a b] > f
a.add b > @
[self] > g
self.f self 1 > @
[self] > h
self.f self 1 2 3 > @
java.lang.Exception: Wrong number of arguments given for method f. Wrong number of arguments given for method f.
[self x] > f
22 > local-def
[void] > local-obj
Exception in thread "main" java.lang.Exception: object with void attributes are not supported yet!
.
[] > parent
[self x] > f
(self.g 10).@ > res
10 > @
[self y] > g
y.add 1 > @
Exception in thread "main" java.lang.Exception: Cannot analyze dot of app: (self.g 10).@
.
[self] > method
some-unsupported-primitive 10 > @
Exception in thread "main" java.lang.Exception: Unsupported 1-ary primitive some-unsupported-primitive
.
[self x] > method
x.some-unsupported-primitive 10 > @
Exception in thread "main" java.lang.Exception: Unsupported 1-ary primitive .some-unsupported-primitive
.
[] > parent
[self x] > f
(self.g 10).@ > res
10 > @
[self y] > g
y.add 1 > @
Exception in thread "main" java.lang.Exception: Methods with no arguments are not supported
.
Exception in thread "main" java.lang.Exception: SMT solver failed with error: ??? </span>.
Exception in thread "main" java.lang.Exception: SMT failed to parse the generated program with error: ???
.
The fourth defect type suported by odin.
An extension class should not access the state of its base
class directly, but only through calling base class methods.
Given the class C
with methods m
and n
:
C = class
x : int := 0; y : int := 0
m => y := y + 1; x := y
n => y := y + 2; x := y
end
And a potential modification M
that affects method n
:
M = modifier
n => x:= x + 2
end
We will have the following: the modification M
, if applied to class C
might cause unnecessary confusion. The initial
implemntation maintains an implicit invariant x=y
by using x := y
at the last action of every method.
However, the redefinition of n
in M
causes the invariant to be broken in case of a calling the modified version
of n
after m
. This will cause y
to be equal to 1, and x
to be equal to 3.
Conversely, the sequence of calls n;m
will cause a different kind of confusion. By looking at C
, one could assume
that the method calls will make x
equal to 3, whereas, in fact, x
will be assigned only 1.
Thus, the best way to avoid such confusion is by only allowing changes to the variables defined in the base class to be made via the corresponding methods of the base class.
In EO, base class state can be modelled with the use of
memory
functionality for variables and cage
functionality for objects.
So, having object a
with state state
:
[] > a
memory > state
[self new_state] > update_state
seq > @
self.state.write new_state
self.state
The proper way to change the state in the subclass b
would be:
[] > b
a > @
[self new_state] > change_state_plus_two
new_state.add 2 > tmp
seq > @
self.update_state self tmp
self.state
An improper way to achieve the same functionality in subclass bad
:
[] > bad
a > @
[self new_state] > change_state_plus_two
seq > @
self.state.write (new_state.add 2)
self.state
Access to state of any superclass is considered a bad practice. Examples can be found in functions read_func
and bad_func
.
NOTE: access to local state, such as
local_state
is not considered a defect.
[] > super_puper
memory > omega_state
[] > super
super_puper > @
memory > super_state
[self] > read_func
self.omega_state.add 2 > @
[] > parent
super > @
memory > parent_state
[] > child
parent > @
memory > local_state
[self] > bad_func
seq > @
self.omega_state.write 10
self.super_state.write 10
self.parent_state.write 10
self.local_state.write 10
Tree
structure from the source code and resolve all parents.memory
or cage
) accessible by each target subclass.Method 'change_state_plus_two' of object 'b' directly accesses state 'state' of base class 'a'
self
object are considered. So, self.state.write 10
is considered,
while state.write 10
is not considered, since it can potentially be a local variable.self
. The current implementation does not consider such cases.The analysis relies on derivation of logical properties functionality and data structures utilized in the unjustified assumption defect type.
The definition of the Liskov substitution principle is as follows:
The principle defines that objects of a superclass shall be replaceable with objects of its subclasses without breaking the application.
We focus on one specific constraint that Liskov substitution principle enforces:
A subtype is not substitutable for its super type if it strengthens its operations' preconditions, or weakens its operations' post conditions.
[] > base
[self x] > f
seq > @
assert (x.less 9)
x.add 1
[] > derived
base > @
[self x] > f
seq > @
assert (x.greater 9)
x.sub 1
Here the redefinition of method f
in the derived
object changes the original input domain of argument x
from (
-inf, 9) to (9, inf). This is a strengthening of the preconditions on method f
, which is a violation of the Liskov
substitution principle.
[] > base
[self x] > f
seq > @
assert (x.less 9)
x.add 1
[] > derived
base > @
[self x] > f
seq > @
assert (x.less 20)
x.sub 1
In this example the Liskov substitution principle is not violated, since method f
in the derived
object expands the
input domain of the function.
Method
a
of objectchild
violates the Liskov substitution principle as compared to version in parent objectgrandparent
Some assumptions are made about EO programs and used EO constructs during analysis for all type of defects. Additionally, some constructs and syntax are not supported intentionally.
distance.
point
5:x
-3:y
point:to
13:x
3.9:y
dx.pow 2
.add
dy.pow 2
.sqrt > length
[other] > add /int[?]
) are not implementedWhile, the imports are recognised by the analyzer in the form of +alias optional-alias imoprt
statements, the actual
code behind them is not used during analysis. More than that, the only instance when imports are considered is during
the setting-locators stage of the third defect analysis. Even then, the information is not used in any meaningful way.
This project is meant to be used as a module, and not a standalone application.
Nevertheless, it is still possible to test the available analyzers.
Running the project requires:
The API exposed by ODIN can be found in the interop project.
One can play around with the analyzer in the sandbox project by modifying the present code snippets or adding new ones.
The sandbox can be run via:
sbt sandbox/run
File containing the run
entrypoint can be found here
.
All existing tests can be run via:
sbt test
The source code can be run in scala REPL via:
sbt console
core
projectContains the EO AST.
All other projects should depend on this one.
analysis
projectContains the implementations of the analyzers odin can use.
backends
projectContains backends for the static analyzer. Backend is something that produces representations the EO AST.
eolang
backendBackend that can generate EO programs from an AST.
utils
projectConvenient tools that are used by other parts of the analyzer.
This project must not depend on any other project.
sandbox
projectAllows one to interactively run and manually test the analyzer. Facilitates the development and debug of the source code and allows one to see intermediate results of the development.
Any other project must not depend on it.
For more details on the project organization see:
parser
projectThis submodule holds the source code for EO parser written in Scala using a library
called cats-parse
. It should recognize any valid EO program and produce an
AST defined in the core
module.
When working on this parser, I tried to make it as close as possible
to specifications from the paper (Section 2, Figure 1), although, due to
specifics of Tymur's AST and the cats-parse
library, there were some diversions.
Most EO programs that were used to test this parser (but not all) are available
in parser/src/test/resources/eo_sources
. The tests themselves can be found in the form of ScalaTest unit tests
in parser/src/test/scala/org/polystat/odin/parser/cats-parse
.
The parser has also been tested on the randomly-generated EO programs. The ability to randonly generate EO programs and run the parser against them is provided by the Scalacheck library.
The existing EO parser implementation was not satisfactory for several reasons:
cats-parse
.master
branch has the latest changes and must always be buildable. All the changes are to be done by creating a pull
request. Once the CI run has successfully finished and a reviewer has approved changes, the code can be manually
merged to the master
branch.
When a pull request is sent to master
or a branch with a pull request to master
is pushed, the following checks will
run via GitHub Actions:
Build — all projects in the repository are built to check that the code compiles
Test — all tests are ran to check that new changes have not broken the existing functionality
Lint — run scalafix. If it fails run sbt scalafixAll
and fix issues that are not autofixable manually.
scalafix official documentation tells that
SemanticDB compiler plugin with -Yrangepos
flag adds overhead to the compilation, so it is recommended to create a
local .jvmopts
file with the following content:
-Xss8m
-Xms1G
-Xmx8G
Check code formatting — the code formatting will be checked via scalafmt
. If it fails run sbt scalafmtAll
and
commit the updated formatting.
For more information, see .github/workflows/ci.yml
.
FAQs
Odin (object dependency inspector) — static analyzer for EO source code that detects OOP-related bugs.
We found that org.polystat.odin:parser_3 demonstrated a not healthy version release cadence and project activity because the last version was released a year ago. It has 0 open source maintainers collaborating on the project.
Did you know?
Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.
Research
Security News
Socket’s threat research team has detected six malicious npm packages typosquatting popular libraries to insert SSH backdoors.
Security News
MITRE's 2024 CWE Top 25 highlights critical software vulnerabilities like XSS, SQL Injection, and CSRF, reflecting shifts due to a refined ranking methodology.
Security News
In this segment of the Risky Business podcast, Feross Aboukhadijeh and Patrick Gray discuss the challenges of tracking malware discovered in open source softare.