Data generator for generating repeatable random data suitable for use in MongoDB databases. The data is designed to be random but self-consistent: user IDs increase consistently, session docs for login and logout are correctly ordered, and sequences of login and logout events are also temporally ordered.
We also ensure that session events happen only after the registered date for the user.
By default we generate one collection, `USERS.profiles`. Users can generate a second collection by specifying the `--session` argument. The sessions collection is called `USERS.sessions`.
To install, use `pip` or `pipenv`. You must be using Python 3.6 or greater.
$ pip install mongodbrdg
Collecting mongodbrdg
...
Successfully built mongodbrdg
Installing collected packages: mongodbrdg
Successfully installed mongodbrdg-0.2a1
`mongodbrdg` installs on your Python path and can be run by invoking:
$ mongodbrdg
It expects to have a `mongod` running on the default port (27017). You can point the program at a different `mongod` and/or cluster by using the `--mongodb` argument.
```
$ mongodbrdg -h
usage: mongodbrdg [-h] [--mongodb MONGODB] [--database DATABASE]
                  [--collection COLLECTION] [--idstart IDSTART]
                  [--idend IDEND] [--maxfriends MAXFRIENDS] [--seed SEED]
                  [--drop] [--report] [--session {none,random,count}]
                  [--sessioncount SESSIONCOUNT]
                  [--sessioncollection SESSIONCOLLECTION]
                  [--bucketsize BUCKETSIZE] [--stats] [-locale LOCALE]
                  [--batchsize BATCHSIZE]

mongodbrdg 0.4.5. Generate random JSON data for MongoDB (requires python 3.6).

optional arguments:
  -h, --help            show this help message and exit
  --mongodb MONGODB     MongoDB host: [default: mongodb://localhost:27017]
  --database DATABASE   MongoDB database name: [default: USERS]
  --collection COLLECTION
                        Default collection for random data:[default: profiles]
  --idstart IDSTART     The starting value for a user_id range [default: 0]
  --idend IDEND         The end value for a user_id range: [default: 10]
  --maxfriends MAXFRIENDS
                        Specify max number of friend to include in profile
                        [default: 0]
  --seed SEED           Use this seed value to ensure you always get the same
                        data
  --drop                Drop data before creating a new set [default: False]
  --report              send all generated JSON to the screen [default: False]
  --session {none,random,count}
                        Generate a sessions collection [default: none do not
                        generate]
  --sessioncount SESSIONCOUNT
                        Default number of sessions to generate.Gives the
                        random bound for random sessions [default: 5]
  --sessioncollection SESSIONCOLLECTION
                        Name of sessions collection: [default: sessions]
  --bucketsize BUCKETSIZE
                        Bucket size for insert_many [default: 1000]
  --stats               Report time to insert data
  -locale LOCALE        Locale to use for data: [default: en]
  --batchsize BATCHSIZE
                        How many docs to insert per batch: [default: 1000]
$
```
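The help text mentions inserting documents in batches (`--batchsize`, `--bucketsize` for `insert_many`). The sketch below shows one way such batching could work; it is an illustration under assumptions, not the tool's actual code. The helper name `batches` is hypothetical, and in a real program each batch would be handed to PyMongo's `Collection.insert_many`.

```python
from itertools import islice


def batches(docs, batchsize=1000):
    """Yield successive lists of at most `batchsize` documents."""
    it = iter(docs)
    while True:
        batch = list(islice(it, batchsize))
        if not batch:
            return
        yield batch


# 2500 placeholder documents, split with the default batch size of 1000.
docs = ({"user_id": i} for i in range(2500))
sizes = [len(batch) for batch in batches(docs, batchsize=1000)]
print(sizes)  # [1000, 1000, 500]
```

Batching keeps memory use bounded and reduces the number of round trips to the server compared with inserting one document at a time.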
# Example Data
The generated data is random but looks real. We use the Python [mimesis](https://mimesis.readthedocs.io/) package
for this purpose. Two separate collections can be created. The
first is the `profiles` collection, which contains example user records. A typical
example document is:
```json
{
"first_name": "Donnetta",
"last_name": "Page",
"gender": "FEMALE",
"company": "Syntel",
"email": "Donnetta.Page@syntel.rio",
"registered": "2010-06-09 11:06:05.882643",
"user_id": 0,
"country": "United States",
"city": "Hayward",
"phone": "1-171-738-1641",
"location": {
"type": "Point",
"coordinates": [
164.393576,
59.535072
]
},
"language": "Dari",
"interests": [
"Reading",
"politics"
]
}
```
The second is the `sessions` collection.
If the user asks for sessions to be generated, we generate a separate collection of session documents keyed by `user_id`.
Session documents look like this:
{
"user_id": 0,
"login": "2021-07-02 22:30:28.790370"
}
{
"user_id": 0,
"logout": "2021-07-04 00:15:29.543370"
}
They always come in matched pairs: a `login` document and a `logout` document. These docs are keyed by the `user_id` field, which always matches a valid `profiles` doc.
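As a rough illustration of the ordering guarantees described above (sessions occur after the user's registration date, and each logout follows its login), here is a minimal sketch. The function name `make_session` and the size of the random offsets are assumptions made for the example, not taken from the tool.

```python
import random
from datetime import datetime, timedelta


def make_session(user_id, registered, rng):
    """Return a (login_doc, logout_doc) pair for one session.

    Hypothetical helper: the offset ranges are arbitrary illustrative choices.
    """
    # The login happens at least one second after registration.
    login = registered + timedelta(seconds=rng.randint(1, 86_400 * 30))
    # The logout happens at least one second after the login.
    logout = login + timedelta(seconds=rng.randint(1, 86_400 * 2))
    return (
        {"user_id": user_id, "login": login},
        {"user_id": user_id, "logout": logout},
    )


rng = random.Random(42)
registered = datetime(2021, 7, 1, 12, 0, 0)
login_doc, logout_doc = make_session(0, registered, rng)
print(login_doc["login"] < logout_doc["logout"])  # True
```

Because both offsets are strictly positive, the ordering holds for any seed.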
Some simple examples of the program in action.
The parameter `--idstart` defaults to zero, so this generates a Python `range` of 0..1, which creates one document.
$ mongodbrdg --idend 1
Inserted 1 user docs into USERS.profiles
$
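The `--idstart`/`--idend` pair behaves like a half-open Python `range`, which is why `--idend 1` yields exactly one document. The snippet below only illustrates that arithmetic; `user_id_range` is a hypothetical helper name, with defaults mirroring those shown in the help text.

```python
def user_id_range(idstart=0, idend=10):
    """Return the user_id values for the half-open range [idstart, idend)."""
    return list(range(idstart, idend))


print(user_id_range(0, 1))           # [0]
print(len(user_id_range(0, 1000)))   # 1000
```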
We use the `--report` option to print the JSON to stdout. Note that
although we generate `datetime` objects internally for insertion, we convert these to a string representation
for output to JSON.
$ mongodbrdg --idend 1 --report
{
"first_name": "Patrick",
"last_name": "David",
"gender": "MALE",
"company": "Danaher",
"email": "Patrick.David@danaher.info",
"registered": "2034-07-08 12:06:05.728825",
"user_id": 0,
"country": "United States",
"city": "Yucca Valley",
"phone": "(065) 868-4054",
"location": {
"type": "Point",
"coordinates": [
-42.659858,
-2.631433
]
},
"language": "Malagasy",
"interests": []
}
Inserted 1 user docs into USERS.profiles
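Python's `json` module cannot serialize `datetime` objects directly, so some conversion to a string is needed before printing. One common approach, shown here as an assumption about how this could be done rather than as the tool's actual code, is to pass `default=str` to `json.dumps`.

```python
import json
from datetime import datetime

# A cut-down profile document holding a real datetime object.
doc = {
    "user_id": 0,
    "registered": datetime(2034, 7, 8, 12, 6, 5, 728825),
}

# default=str is called for any value json cannot encode natively.
text = json.dumps(doc, indent=2, default=str)
print(text)
```

The string form produced by `str()` on a `datetime` matches the layout of the `registered` values in the sample output.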
$ mongodbrdg --idend 1000
Inserted 1000 user docs into USERS.profiles
$
Use the `--seed` option to specify an integer seed. Using the same
seed ensures that the identical set of random data is generated
every time.
$ mongodbrdg --idend 1 --report --seed 123
{
"first_name": "Billy",
"last_name": "Evans",
"gender": "MALE",
"company": "Antec",
"email": "Billy.Evans@antec.lt",
"registered": "2018-05-03 13:17:06.879234",
"user_id": 0,
"country": "United States",
"city": "Battle Ground",
"phone": "1-288-353-0157",
"location": {
"type": "Point",
"coordinates": [
-83.63622,
48.41215
]
},
"language": "Dhivehi",
"interests": [
"Darts",
"Golf",
"politics",
"Board gaming",
"Football"
]
}
Inserted 1 user docs into USERS.profiles
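The effect of a fixed seed can be demonstrated with Python's standard `random` module. This is a minimal sketch of the general idea only; the `generate` function and the name list are hypothetical and do not reflect how the tool or mimesis produce their values.

```python
import random

FIRST_NAMES = ["Billy", "Jetta", "Lavona", "Enrique"]  # illustrative values


def generate(seed, count):
    """Build `count` toy profile docs from a private, seeded generator."""
    rng = random.Random(seed)
    return [
        {
            "user_id": user_id,
            "first_name": rng.choice(FIRST_NAMES),
            "longitude": round(rng.uniform(-180, 180), 6),
        }
        for user_id in range(count)
    ]


run1 = generate(123, 3)
run2 = generate(123, 3)
print(run1 == run2)  # True
```

Using a dedicated `random.Random` instance rather than the module-level functions keeps the sequence independent of any other code that draws random numbers.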
We specify that we want session records with the `--session count` arg. This
tells us to generate a specific number of sessions. We then specify the number
of sessions with `--sessioncount`. A single session generates a `login` and a
`logout` document. For any given session, the `logout` event always happens
after the `login` event.
You can specify the following arguments to `--session`:
* `count`: generate a specific number of sessions per user
* `random`: generate a random number of sessions between 0 and `--sessioncount`
* `none` (default): do not generate session data

$ mongodbrdg --idend 1 --report --session count --sessioncount 1
{
"first_name": "Jetta",
"last_name": "Cline",
"gender": "FEMALE",
"company": "Integra Design",
"email": "Jetta.Cline@integradesign.aero",
"registered": "2029-05-09 19:40:34.866333",
"user_id": 0,
"country": "United States",
"city": "Topeka",
"phone": "(724) 108-6398",
"location": {
"type": "Point",
"coordinates": [
-154.636317,
34.894447
]
},
"language": "Malayalam",
"interests": [
"Football",
"skydiving",
"Running",
"politics",
"Triathlon"
]
}
{
"user_id": 0,
"login": "2029-05-10 23:28:34.980333"
}
{
"user_id": 0,
"logout": "2029-05-12 01:36:35.248333"
}
Inserted 1 user docs into USERS.profiles
Inserted 2 session docs into USERS.sessions
$
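The three `--session` modes can be summarised in a few lines of Python. This is a sketch under assumptions: the function name `sessions_for_user` is hypothetical, and treating the upper bound in `random` mode as inclusive is a guess, since the text only says "between 0 and `--sessioncount`".

```python
import random


def sessions_for_user(mode, sessioncount, rng):
    """Return how many sessions to generate for one user."""
    if mode == "none":
        return 0
    if mode == "count":
        return sessioncount
    if mode == "random":
        # Assumption: both ends of the range are inclusive.
        return rng.randint(0, sessioncount)
    raise ValueError(f"unknown session mode: {mode!r}")


rng = random.Random(7)
fixed = sessions_for_user("count", 1, rng)
# Each session yields one login document and one logout document.
print(fixed * 2)  # 2
```

The factor of two matches the example output above, where one session for one user results in 2 session docs.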
For data requiring a graph structure we can add a `friends` field to the profile
using the `--maxfriends` argument. The default for `--maxfriends` is 0.
When the value is zero the field is omitted. For any value greater than zero we
generate `0..maxfriends` friends for each user. The friends are selected at
random from `--idstart` to `--idend`.
$ mongodbrdg --idend 10 --maxfriends 2 --report
{
"first_name": "Lavona",
"last_name": "Rodriquez",
"gender": "FEMALE",
"company": "Atari",
"email": "Lavona.Rodriquez@atari.ua",
"registered": "2005-08-10 22:02:45.687955",
"user_id": 0,
"country": "United States",
"city": "Watsonville",
"phone": "1-407-350-4386",
"location": {
"type": "Point",
"coordinates": [
-26.850068,
-56.593144
]
},
"language": "Japanese",
"friends": [
1,
2
],
"interests": [
"Triathlon"
]
}
...
"first_name": "Enrique",
"last_name": "Dennis",
"gender": "MALE",
"company": "ABX Air",
"email": "Enrique.Dennis@abxair.consulting",
"registered": "2019-02-17 00:09:41.859429",
"user_id": 9,
"country": "United States",
"city": "Hemet",
"phone": "159-721-4912",
"location": {
"type": "Point",
"coordinates": [
-32.912016,
-11.078288
]
},
"language": "Tsonga",
"friends": [
1,
7
],
"interests": [
"Running",
"Darts"
]
}
Inserted 10 user docs into USERS.profiles
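The friend selection described above could look roughly like the following. This is a sketch, not the tool's implementation: the name `pick_friends` is hypothetical, and both excluding a user from their own friends list and sampling without duplicates are assumptions that the text does not confirm.

```python
import random


def pick_friends(user_id, idstart, idend, maxfriends, rng):
    """Return a list of friend ids, or None when the field is omitted."""
    if maxfriends == 0:
        return None  # the friends field is left out of the profile
    # Assumption: a user is never their own friend.
    candidates = [i for i in range(idstart, idend) if i != user_id]
    count = rng.randint(0, maxfriends)
    # Assumption: no duplicate friends, so sample without replacement.
    return rng.sample(candidates, min(count, len(candidates)))


rng = random.Random(1)
friends = pick_friends(0, 0, 10, 2, rng)
print(friends)
```

Returning `None` for `maxfriends == 0` lets the caller skip adding the field entirely, which mirrors the "field is omitted" behaviour.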