geolocation_service


0.1.2
FindHotel Coding Challenge

The FindHotel coding challenge consists of two parts, a library and a REST API application:

  • A library with two main features:
    • A service that parses the CSV file containing the raw data and persists it in a database;
    • An interface to provide access to the geolocation data (model layer);
  • A REST API that uses the aforementioned library to expose the geolocation data.

This repository contains my solution to the library. You can find my solution to the REST API application here: https://github.com/jalerson/geolocation_api.

Geolocation Service

The library was developed as a Rails Engine gem, which can be easily and seamlessly integrated into any Rails application.

Installation

Add this line to your application's Gemfile:

gem 'geolocation_service'

And then execute:

$ bundle

Install the gem's migrations:

$ rails geolocation_service_engine:install:migrations

Execute pending migrations:

$ rake db:migrate

Usage

The gem provides four new models: Ip, City, Country and Location.

[Figure: models and associations]

To import data, use GeolocationService::Services::ImportBulkDataService.

GeolocationService::Services::ImportBulkDataService.call(file_path: 'path/to/data_file.csv')

The service returns a Dry::Monads::Result indicating whether the import succeeded or failed.

result = GeolocationService::Services::ImportBulkDataService.call(file_path: 'path/to/data_file.csv')

if result.success?
  # do something...
else
  # do something else...
end

When the import succeeds, the result contains an instance of GeolocationService::ImportResult, which has:

  • imported_records: number of imported records
  • invalid_records: number of invalid records
  • time_consumed: time consumed in seconds

When the import fails, the result contains an error/exception.

result = GeolocationService::Services::ImportBulkDataService.call(file_path: 'path/to/data_file.csv')

if result.success?
  import_results = result.value!
  Rails.logger.info "Records imported in #{import_results.time_consumed} seconds"
else
  error = result.failure
  Rails.logger.error error.message
end

Design decisions

The guiding principle for the design decisions in this project was to provide the best import performance while keeping the database normalized.

To achieve this, experiments were conducted to find the fastest approach for each of three steps: (a) converting the CSV data to a format/representation that can be validated and stored in the database, (b) validating the data, and (c) storing the data in the database. The alternatives considered for each step were:

(a) CSV data conversion: ActiveRecord instances, simple Ruby classes (no ActiveRecord), Arrays/Hashes and Structs.

(b) data validation: ActiveRecord validations or contracts/schemas.

(c) storing into the database: to keep performance as high as possible, the alternatives were limited to those that persist a set of records in a single INSERT statement: the activerecord-import gem, or writing the SQL statements and sending them directly to the database.

After several experiments with different combinations, the chosen approach uses Structs, contracts/schemas, and SQL statements sent directly to the database. This combination imported one million records in approximately 4 minutes while avoiding duplicates and keeping the code clean.
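
The chosen combination can be illustrated with a minimal sketch. This is not the gem's actual code: the IpRow struct, the column names, the valid_row? check (a plain-Ruby stand-in for a contract/schema), and the build_insert helper are all assumptions made for illustration.

```ruby
require "csv"

# Hypothetical row representation; the column names are assumptions.
IpRow = Struct.new(:ip_address, :country_code, :city, keyword_init: true)

# Loose IPv4 shape check used by the stand-in validation below.
IPV4 = /\A(?:\d{1,3}\.){3}\d{1,3}\z/

# Stand-in for a contract/schema: true when the row is acceptable.
def valid_row?(row)
  IPV4.match?(row.ip_address.to_s) && !row.city.to_s.strip.empty?
end

# Quote a value as a SQL string literal (illustration only; real code
# should rely on the database adapter's quoting).
def sql_quote(value)
  "'#{value.to_s.gsub("'", "''")}'"
end

# Build one multi-row INSERT statement covering every given row.
def build_insert(table, rows)
  columns = IpRow.members.join(", ")
  values = rows.map { |row| "(#{row.to_a.map { |v| sql_quote(v) }.join(', ')})" }
  "INSERT INTO #{table} (#{columns}) VALUES #{values.join(', ')}"
end

data = <<~DATA
  ip_address,country_code,city
  200.106.141.15,SI,DuBuquemouth
  not-an-ip,CZ,Prague
  160.103.7.140,CZ,O'Brien
DATA

# (a) convert CSV lines to Structs, (b) validate, (c) emit a single INSERT.
rows = CSV.parse(data, headers: true).map do |line|
  IpRow.new(**line.to_h.transform_keys(&:to_sym))
end
valid, invalid = rows.partition { |row| valid_row?(row) }

puts build_insert("ips", valid)
puts "invalid records: #{invalid.size}"
```

Structs avoid the per-object overhead of ActiveRecord instances, and a single multi-row INSERT replaces one round trip per record, which is where most of the saving in a bulk import is likely to come from.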

Trade-offs

To keep import performance as high as possible, two design decisions have consequences that users need to be aware of.

  • Memory usage: when importing a set of records, the service loads all existing records into memory and adds the new records to them. This is how the service avoids creating duplicate records.
  • id (primary key) needs to be set manually: to guarantee correct relationship constraints between records during an import, the id must be set manually in all tables except locations.
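
Both trade-offs can be seen together in a small sketch. The IdRegistry class below is hypothetical and not part of the gem: it assumes that existing records are loaded into a Hash keyed by a natural key (a country code here) and that new ids continue from the highest existing id.

```ruby
# Hypothetical in-memory registry for one table (for example, countries).
class IdRegistry
  attr_reader :new_records

  # existing: natural key => id for rows already in the database.
  def initialize(existing = {})
    @ids = existing.dup
    @next_id = (@ids.values.max || 0) + 1
    @new_records = []
  end

  # Return the id for a key, assigning a fresh id the first time it is seen.
  def id_for(key)
    @ids.fetch(key) do
      id = @next_id
      @next_id += 1
      @ids[key] = id
      @new_records << { id: id, key: key }
      id
    end
  end
end

countries = IdRegistry.new("NL" => 1, "BR" => 2) # rows already stored
ids = %w[NL PT BR PT].map { |code| countries.id_for(code) }

p ids                          # => [1, 3, 2, 3]
p countries.new_records.size   # => 1 (only "PT" is new)
```

Because ids are known before anything is written, child rows can reference their parents within the same batch of INSERT statements. The cost is that the whole key set must fit in memory.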

Package last updated on 22 Sep 2019
