
Prda contains packages for data processing, analysis and visualization. The ultimate goal is to fill the “last mile” between analysts and packages.
During my research practice, I have felt how “learning a package before using it” can be time-consuming and exhausting. The resulting inefficiency led to the creation of prda.
pip install prda
See details at: https://pypi.org/project/prda/
You are welcome to clone prda for personal use, and pull requests with your modifications are strongly encouraged.
To use prda, you only need to be familiar with pandas, as most inputs are pd.DataFrame objects.
Currently, with the help of ChatGPT, you can simply tailor the inputs of the demonstration code below to your data, even if you are not familiar with pandas or Python.
import prda
import pandas as pd
import numpy as np
df = pd.DataFrame(data=np.array([np.arange(100) for _ in range(5)]).T, columns=['a', 'b', 'c', 'd', 'e'])
prda.graphic.scatter_3d_html(df, x='a', y='b', z='c', color_hue='d', size_hue='e', title='demo_3d_scatter', filepath='demo_3d_scatter.html')
The above code produces an interactive HTML figure that looks like this:
import prda
import pandas as pd
import numpy as np
datalen = 500
indices = np.arange(datalen)
col_a = np.arange(0, 10, 10/datalen)
col_b = np.random.randint(3, 8, datalen)
data = np.array([indices, col_a, col_b]).T
df = pd.DataFrame(data=data, columns=['idx', 'a', 'b'])
# draw
import random
point_markers = {
'a': [(indices[i], col_a[i]) for i in random.sample(list(indices), 20)]
}
prda.graphic.lineplot_html(df, x='idx', y=['a', 'b'], markpoints=point_markers, filepath='demo_lineplot.html')
| | idx | a | b |
|---|---|---|---|
| 0 | 0.0 | 0.00 | 6.0 |
| 1 | 1.0 | 0.02 | 3.0 |
| 2 | 2.0 | 0.04 | 4.0 |
| ... | ... | ... | ... |
| 498 | 498.0 | 9.96 | 6.0 |
| 499 | 499.0 | 9.98 | 5.0 |
With the above DataFrame, the code draws another plot that looks like this:
Code for filtering continuous variables in data with a unique-value threshold of 5:
from prda import prep
prep.select_continuous_variables(data, unique_threshold=5)
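To make the idea of a unique-value threshold concrete, here is a minimal sketch in plain pandas. It is not prda's implementation: the helper name `select_continuous_sketch` and the rule it applies (keep numeric columns with more than `unique_threshold` distinct values) are assumptions made purely for illustration.

```python
import numpy as np
import pandas as pd


def select_continuous_sketch(df: pd.DataFrame, unique_threshold: int = 5) -> pd.DataFrame:
    """Hypothetical helper (not prda's code): keep numeric columns that have
    more than `unique_threshold` distinct values."""
    numeric = df.select_dtypes(include="number")
    keep = [col for col in numeric.columns if numeric[col].nunique() > unique_threshold]
    return df[keep]


demo = pd.DataFrame({
    "height": np.linspace(150.0, 190.0, 20),  # 20 distinct values: kept
    "group": np.tile([0, 1], 10),             # 2 distinct values: dropped
    "label": ["x"] * 20,                      # non-numeric: dropped
})
selected = select_continuous_sketch(demo, unique_threshold=5)
print(list(selected.columns))
```

Whether prda treats a column with exactly `unique_threshold` distinct values as continuous is not stated here, so check the package's docstring before relying on the boundary case.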
Code for evaluating hyperparameter combinations for a given algorithm using a user-specified cross-validation method:
from prda.ml import evaluations
param_grid = {'k': [4,5,6,7]}
evaluations.evaluate_param_combinations(X, y, knn_algorithm, param_grid=param_grid, cv=10, visualize_results=True)
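The general technique behind this kind of call (enumerate every combination in the grid, then score each one with k-fold cross-validation) can be sketched with NumPy alone. Everything below is illustrative: `TinyKNN` and `evaluate_grid_sketch` are hypothetical names, the toy data is generated on the spot, and none of it reflects how prda implements `evaluate_param_combinations`.

```python
from itertools import product

import numpy as np


class TinyKNN:
    """Illustrative k-NN classifier exposing fit/predict (hypothetical)."""

    def __init__(self, k=5):
        self.k = k

    def fit(self, X, y):
        self.X_ = np.asarray(X, dtype=float)
        self.y_ = np.asarray(y)
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        # Pairwise Euclidean distances, shape (n_query, n_train).
        dists = np.linalg.norm(X[:, None, :] - self.X_[None, :, :], axis=2)
        nearest = np.argsort(dists, axis=1)[:, : self.k]
        votes = self.y_[nearest]
        # Majority vote; labels are assumed to be non-negative integers.
        return np.array([np.bincount(row).argmax() for row in votes])


def evaluate_grid_sketch(X, y, make_model, param_grid, cv=5, seed=0):
    """Return {parameter tuple: mean accuracy} over `cv` shuffled folds."""
    X, y = np.asarray(X), np.asarray(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), cv)
    names = sorted(param_grid)
    results = {}
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        scores = []
        for i, test_idx in enumerate(folds):
            train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
            model = make_model(**params).fit(X[train_idx], y[train_idx])
            scores.append(np.mean(model.predict(X[test_idx]) == y[test_idx]))
        results[tuple(params.items())] = float(np.mean(scores))
    return results


# Toy data: two well-separated clusters with integer labels 0 and 1.
data_rng = np.random.default_rng(42)
X = np.vstack([data_rng.normal(0.0, 0.5, (30, 2)), data_rng.normal(10.0, 0.5, (30, 2))])
y = np.array([0] * 30 + [1] * 30)

scores = evaluate_grid_sketch(X, y, TinyKNN, {"k": [4, 5, 6, 7]}, cv=10)
best = max(scores, key=scores.get)
print(best, scores[best])
```

The sketch takes a model factory rather than a fitted object so that each fold trains a fresh model, which keeps folds from leaking state into each other.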
A common task in my research practice is creating well-structured folders to save experimental results. With the following function, you only need to think about how you want your files to be structured. All related folders will be created automatically:
from prda import iostream
iostream.create_dirs([
'results/experiment1/f1_score.csv',
'results/experiment1/accuracy.csv',
'results/experiment2/',
'results/experiment10/accuracy/',
'results/experiment10/f1_score/r1.txt',
])
The above call creates all the folders for you, following the structure shown below, so you can then store your results without worrying about the file structure.
results/
├── experiment1/
│   ├── f1_score.csv
│   └── accuracy.csv
├── experiment2/
└── experiment10/
    ├── accuracy/
    └── f1_score/
        └── r1.txt
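For readers curious how such a helper might work, here is a minimal sketch built on the standard library's `pathlib`. It is an assumption-laden illustration, not prda's source: the name `create_dirs_sketch`, its `root` parameter, and the convention that an entry ending in `/` is a directory while anything else is a file whose parent folder should exist are all hypothetical.

```python
import tempfile
from pathlib import Path


def create_dirs_sketch(paths, root="."):
    """Hypothetical helper: make sure the folder for every entry exists.

    Assumed convention: an entry ending in '/' names a directory; any other
    entry names a file, so only its parent directory is created.
    """
    created = []
    for entry in paths:
        target = Path(root) / entry
        folder = target if entry.endswith("/") else target.parent
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


# Run inside a temporary directory so the demo leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    create_dirs_sketch(
        ["results/experiment1/f1_score.csv", "results/experiment10/accuracy/"],
        root=tmp,
    )
    found = sorted(p.relative_to(tmp).as_posix() for p in Path(tmp).rglob("*"))
    print(found)
```

Note that this sketch creates folders only; files such as `f1_score.csv` would appear once you write results to them.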
The methods in prda are quite self-explanatory, so we think the above demonstration suffices for now. The current prda is far from complete, let alone perfect, but it is improved regularly.
Add several easy-to-use functions, including prep::pca, select_continuous_variables, handle_missing_data, apply_linear_func (row-wise), and ml::match_clusters, evaluate_param_combinations (optimal parameter search, with base class sklearn.base.BaseEstimator), etc.
ml::neighbors::VariableKNN. The algorithm behaves like a sklearn classifier, which means you can use it directly via fit(·) and predict(·).
iostream::create_dirs.