Chat with usChat with us

Batch import your locations CSV file using Woosmap API

batch-import-csv-api

One of the first things a new Woosmap user does to get started and before embedding the Woosmap WebApp in his site is to import his location data using Woosmap Data Management API. To fulfil this requirement, we provide you through this guide a python sample to help you dealing with your locations as a CSV Spreadsheet.



Parse and process your CSV File

Sample CSV File

The data used to illustrate this point is an extracted CSV from Data Gouv France representing all museums in France. You can see below a subset of what the file museum-france-data.csv looks like:

Image

This CSV File contains some null geometry values (not well geocoded) and columns that we’re not interested in. To deal with it, we need to clean this dataset. csvkit is a suite of command-line tools for converting to and working with CSV. In our case, especially csvgrep and csvcut are very helpful to extract the desired data. You could work with a graphical user interface app, like Apple Numbers, Microsoft Excel or Google Sheet but csvkit lets you work with your data more efficiently.


Retain only all non-zero geocoded data (where latitude field is not empty).

csvgrep -d ";" -c latitude -r "^(?=\s*\S).*$" france-museum-.csv |

Then extract the desired columns.

csvcut -c "NOM DU MUSEE",ADR,CP,VILLE,SITWEB,latitude,longitude > france_museum_geocoded.csv

CSV Python Module

Because Woosmap Data Management API only accepts JSON as input, the next step is to transform each row of your file to Woosmap elements. The following python script is based on the native csv module. It makes it easier to deal with csv formatted file, especially when working with data exported from spreadsheets and databases into text files. We preferred to iterate over the data rows using csv.DictReader instead of classic reader mainly because it takes advantage of the header row with the column names and gives you the opportunity to grab it cell value by its name.


Dialect

There is no well-defined standard for comma-separated value files, there are plenty of ways one CSV file can differ from another, yet contains exactly the same data. Many tools, which can import or export tabular data allow the user to indicate the field delimiter, quote character, line terminator, and other characteristics of the file. All these parameters are grouped together conveniently into a dialect object.



When creating a csv.DictReader object, the programmer can specify a subclass of the Dialect class as the dialect parameter. In our CSV file the fields are separated by commas and some values are double quoted.

import csv
from _csv import QUOTE_ALL
from csv import Dialect
class data_gov_dialect(Dialect):
delimiter =','
quotechar = '"'
doublequote = True
skipinitialspace = False
lineterminator = '\n'
quoting = QUOTE_ALL
csv.register_dialect('dg', data_gov_dialect)

Work with Woosmap API

Data Structure

The one step further is to transform each row element to a location object that follows the required structured format. The mandatory fields are a storeId (as String), a name and a location geometry object (a couple of Latitude and Longitude). Below is the method used to transform each row of our CSV to a Woosmap location element:

def get_geometry(store):
return {
'lat': store['latitude'],
'lng': store['longitude']
}
def get_contact(store):
return {
'website': store['SITWEB']
}
def get_address(store):
return {
'lines': [store['ADR']],
'city': store['VILLE'],
'zipcode': store['CP'],
}
def datagov2woosmap(store, id):
geometry = get_geometry(store)
address = get_address(store)
contact = get_contact(store)
return {
'storeId': id,
'name': store['NOM DU MUSEE'],
'address': address,
'contact': contact,
'location': geometry
}

We are now able to open the file and iterate over the data to build an array of all the Woosmap location elements before POST them to Woosmap API:

with open(endpoint_csv, 'r') as f: #open the file
reader = csv.DictReader(f, dialect="dg") #read it as a dict of dict
batch = []
for location in reader: #loop over this dictreader
try:
woosmap_location = datagov2woosmap(location) #build the woosmap element
batch.append(woosmap_location) #and append it to an array
except InvalidGeometry:
pass
import(batch) #finally import all your data

Import using HTTP POST Method

The Woosmap Data API is a RESTful API with endpoint https://api.woosmap.com/stores. It takes a mandatory parameter : your private key. Therefore the url you call should look like this: htps://api.woosmap.com/?private_key=YOUR_PRIVATE_KEY. The API provides 4 HTTP methods for interacting with resources but in our case we are interested in creating new resources so we’ll use the POST method.

import requests
private_key = ''  #your woosmap_api_private_key here
endpoint_api = 'https://api.woosmap.com/stores'
def import(batch):
session = requests.Session()
response = session.post(endpoint_api,
params={'private_key': private_key},
json={'stores': batch})

Rolling batch import

If you work with more than 1000 locations, we recommend you to set a rolling import every X elements. To do this, define a numerical constant in your code (100 for example) and execute your POST request each time this counter value is reached.

batch_size = 100
batch = []
for location in data: #loop over this array
try:
woosmap_location = datagov2woosmap(location) #build the woosmap element
if len(batch) == batch_size: 
import(batch) #import when you batch array contains 100 elements
batch = []
else:
batch.append(woosmap_location)
except InvalidGeometry:
pass
if batch:
import(batch) #finally import the remaining elements 

As you could see in the sample hosted on GitHub, the code allows you to update your stores using PUT http verb and DELETE the whole resource before re-creating it. Don’t forget that the update method is based on unique identifier of your data so be careful to keep the same identifier.



Sources