
A Python package that helps data scientists, machine learning engineers, and analysts better understand their data. It gives quick insights about a given dataset: general statistics, the shape of the dataset, the set of unique data types, the number of numerical and non-numerical columns, missing data statistics, and a missing data heatmap. It also provides a method to impute missing data.
Why datastand? Data + Understand
A Python package that helps data scientists, machine learning engineers, and analysts better understand their data by giving quick insights about a given dataset.
Run the following command in a terminal to install the package:
pip install datastand
Code:
from datastand import datastand
import pandas as pd
df = pd.read_csv("path/to/target/dataframe")
datastand(df)
Output:
General stats:
==================
Shape of DataFrame: (1202, 13)
Number of unique data types : {dtype('int64'), dtype('O')}
Number of numerical columns: 2
Number of non-numerical columns: 11
Missing data:
=======================
DataFrame contains 2670 missing values (17.09%) as follows column-wise:
-----------------------------------------------------------------------
Gender 41
Car_Category 372
Subject_Car_Colour 697
Subject_Car_Make 248
LGA_Name 656
State 656
dtype: int64
-----------------------------------------------------------------------
Do you wish to long-list missing data statistics?(y/n): y
.
.
.
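For readers who want to see where numbers like these come from, the figures above can be reproduced with plain pandas. This is only a minimal sketch and not datastand's actual implementation; the function name `summarize` and the toy DataFrame are hypothetical. It assumes the missing percentage is taken over all cells, which matches the sample output: 2670 of 1202 x 13 = 15626 cells is 17.09%.

```python
import numpy as np
import pandas as pd


def summarize(df: pd.DataFrame) -> dict:
    """Collect summary statistics similar to those in the sample output.

    Hypothetical helper for illustration; not part of datastand.
    """
    numeric = df.select_dtypes(include="number")
    per_column = df.isna().sum()          # missing values per column
    n_missing = int(per_column.sum())     # missing values in the whole frame
    return {
        "shape": df.shape,
        "dtypes": set(df.dtypes),
        "n_numerical": numeric.shape[1],
        "n_non_numerical": df.shape[1] - numeric.shape[1],
        "n_missing": n_missing,
        # Percentage over all cells (rows * columns), guarded for empty frames
        "pct_missing": round(100 * n_missing / df.size, 2) if df.size else 0.0,
        # Keep only the columns that actually have missing values
        "missing_by_column": per_column[per_column > 0],
    }


# Toy data, made up for this example
toy = pd.DataFrame(
    {
        "age": [25, np.nan, 40],
        "gender": ["F", None, "M"],
        "state": ["Lagos", "Abuja", None],
    }
)
stats = summarize(toy)
print(stats["shape"], stats["n_missing"], stats["pct_missing"])
print(stats["missing_by_column"])
```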
Code:
# This function is already available in the DataStand class and also available separately
# Here we're running it separately
from datastand import plot_missing
plot_missing(df)
Output: (missing data heatmap; image not reproduced here)
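A missing data heatmap is essentially a picture of the boolean "is this cell missing?" mask. The sketch below shows one way to draw such a plot with pandas and matplotlib. It is an illustration only and makes no claim about how plot_missing is implemented; `missing_mask`, `plot_missing_sketch`, and the default file name are hypothetical.

```python
import numpy as np
import pandas as pd


def missing_mask(df: pd.DataFrame) -> np.ndarray:
    """Return a rows x columns array with 1 where a value is missing, else 0."""
    return df.isna().to_numpy().astype(int)


def plot_missing_sketch(df: pd.DataFrame, path: str = "missing.png") -> str:
    """Save a simple missing data heatmap to `path` and return the path.

    Hypothetical helper for illustration; not datastand's plot_missing.
    """
    # Imported lazily so the mask helper works without matplotlib installed
    import matplotlib
    matplotlib.use("Agg")  # headless backend: write to a file, open no window
    import matplotlib.pyplot as plt

    mask = missing_mask(df)
    fig, ax = plt.subplots(figsize=(8, 4))
    # Dark cells mark missing values, light cells mark present values
    ax.imshow(mask, aspect="auto", cmap="gray_r", interpolation="nearest")
    ax.set_xticks(range(df.shape[1]))
    ax.set_xticklabels(df.columns, rotation=90)
    ax.set_ylabel("row")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)
    return path


# Toy data, made up for this example
toy = pd.DataFrame({"age": [25, np.nan, 40], "state": ["Lagos", "Abuja", None]})
mask = missing_mask(toy)
print(mask)
```

Calling `plot_missing_sketch(toy)` writes the image to disk; each column of the picture corresponds to one DataFrame column, so columns with many gaps stand out as mostly dark stripes.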
Code:
from datastand import impute_missing
impute_missing(df)
Output:
Imputing missing data...
100%|██████████| 80/80 [00:02<00:00, 30.52it/s]
Imputation complete.
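The README does not say which imputation strategy impute_missing uses, so the following is a deliberately simple stand-in rather than the package's method: numeric columns are filled with their median and all other columns with their most frequent value. The name `simple_impute` and the toy data are hypothetical, and unlike the call above this sketch returns a new DataFrame instead of relying on any in-place behaviour.

```python
import numpy as np
import pandas as pd


def simple_impute(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with missing values filled by median or mode.

    Hypothetical baseline for illustration; not datastand's impute_missing.
    """
    out = df.copy()
    for col in out.columns:
        if not out[col].isna().any():
            continue  # nothing to fill in this column
        if pd.api.types.is_numeric_dtype(out[col]):
            fill = out[col].median()  # NaN values are skipped by default
        else:
            modes = out[col].mode(dropna=True)
            if modes.empty:
                continue  # column is entirely missing; leave it untouched
            fill = modes.iloc[0]
        out[col] = out[col].fillna(fill)
    return out


# Toy data, made up for this example
toy = pd.DataFrame(
    {
        "age": [20, np.nan, 40],
        "colour": ["red", None, "red"],
    }
)
filled = simple_impute(toy)
print(filled)
```

Median and mode filling is easy to reason about but ignores relationships between columns, so treat it as a baseline to compare against whatever the package produces.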
Vincent N. [LinkedIn] [Twitter]