Huge News!Announcing our $40M Series B led by Abstract Ventures.Learn More
Socket
Sign inDemoInstall
Socket

atlantis

Package Overview
Dependencies
Maintainers
1
Alerts
File Explorer

Advanced tools

Socket logo

Install Socket

Detect and block malicious and high-risk dependencies

Install

atlantis

Python library for simplifying data science

  • 2023.6.21
  • PyPI
  • Socket score

Maintainers
1

Atlantis

Atlantis is a Python library for simplifying programming with Python for data science.

Installation

You can just use pip to install Atlantis:

pip install atlantis

Modules

  • collections helps with working with collections.
  • colour simplifies using colours.
  • ds (datascience) provides tools for:
    • data wrangling,
    • validation,
    • tuning,
    • sampling,
    • evaluation,
    • clustering, and
    • parallel processing of machine learning models.
  • functions manages higher order functions.
  • hash simplifies and standardizes hashing.
  • text makes working with texts and strings easy.
  • time
    • provides methods for interacting with time and date as well as
    • progress bars

collections

This module of the package atlantis helps with working with collections.

flatten

from atlantis.collections import flatten
flatten([1, 2, [3, 4, [5, 6], 7], 8])

returns: [1, 2, 3, 4, 5, 6, 7, 8]

List

This class inherits from Python's list class but implements a few additional functionalities.

from atlantis.collections import List
l = List(1, 2, 3, 4, 2, [1, 2], [1, 2])

Flattening:

l.flatten()
>>> List: [1, 2, 3, 4, 2, 1, 2, 1, 2]

Finding duplicates:

l.get_duplicates()
>>> List: [2, List: [1, 2]]

Note: the list elements of a List automatically get converted to Lists, recursively.

ds (Data Science)

This module provides data science tools for:

  • data wrangling,
  • validation,
  • tuning,
  • sampling,
  • evaluation,
  • clustering, and
  • parallel processing of machine learning models.

KMeans Clustering

I have used the KMeans class from both sklearn and that of pyspark and was frustrated by two problems: (a) even though the two classes do exactly the same thing their interfaces are vastly different and (b) some of the simplest operations are very hard to do with both classes. I solved this problem by creating my own KMeans class that is a wrapper aroung both of those classes and uses the appropriate one automatically without complicating it for the data scientist programmer.

Usage
from atlantis.ds.clustering import KMeans

kmeans = KMeans(n_clusters=3, n_jobs=10)
kmeans.fit(X=X)

predictions = kmeans.predict(X=X)
transformed_x = kmeans.transform(X=X)

Clustering Optimization

Usage
from atlantis.ds.clustering import ClusteringOptimizer

clustering_optimizer = ClusteringOptimizer(min_k=2, max_k=16, n_jobs=10)
clustering_optimizer.fit(X=X)
print(f'best number of clusters: {clustering_optimizer.optimal_number_of_clusters}')

FAQs


Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts

SocketSocket SOC 2 Logo

Product

  • Package Alerts
  • Integrations
  • Docs
  • Pricing
  • FAQ
  • Roadmap
  • Changelog

Packages

npm

Stay in touch

Get open source security insights delivered straight into your inbox.


  • Terms
  • Privacy
  • Security

Made with ⚡️ by Socket Inc