We’re on DuckDuckGo

Users can now run legislation searches directly from the DuckDuckGo search engine using the “!laws” shortcut.

Just type “!laws climate change” in DuckDuckGo to be taken directly to Global-Regulation.com’s search results for “climate change”.


Big Data With Purpose: How We Calculated the Fines of 1.55 Million Laws

This is a technical explanation of how we built our “PenaltyAI Search” service, which combs 1.55 million laws from 79 countries for fines. It can answer questions like “What would I pay for violating money laundering laws in Jamaica?” or “How much would a smuggler who warehouses stolen goods in China pay if they’re caught?”

The penalties are extracted by an offline algorithm, running on an Azure VM, that performs the following steps:

  1. Find laws that mention keywords associated with civil penalties (as a first pass)
  2. Convert all word numbers (like “one million”) into international number format (“1,000,000.00”)
  3. Identify the paragraphs that likely contain civil penalties based on words and numbers
  4. Merge several penalties into one when they relate to the same “clause” (section) of a law
  5. Extract all the clauses and penalties
  6. Exclude certain classes of text that are almost never penalties but look like penalties (such as laws about gold coins and section references in laws that have to do with money)
  7. Recognize currencies in the text, combine them with our table of national currencies, and convert penalties into USD using Yahoo! Finance rates (through the XML API)
  8. Store the penalties and clauses in a MySQL database (RDS)
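
Steps 2 and 3 can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the production algorithm: the keyword list, the regular expressions, and the function names (words_to_number, normalize_numbers, looks_like_penalty) are all made up for the example, and it only understands English number words.

```python
import re

# Number words this sketch understands (an assumption: English only).
UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}
SCALES = {"thousand": 1_000, "million": 1_000_000, "billion": 1_000_000_000}

# Longest words first so "seventeen" is tried before "seven".
_WORD = "|".join(sorted([*UNITS, *TENS, *SCALES, "hundred"], key=len, reverse=True))
NUMBER_RUN = re.compile(
    rf"\b(?:{_WORD})(?:(?:\s+and\s+|[\s-]+)(?:{_WORD}))*\b", re.IGNORECASE
)
# Hypothetical first-pass keywords; a real list would be much longer.
PENALTY_WORDS = re.compile(r"\b(?:fine|fined|penalty|penalties)\b", re.IGNORECASE)
ANY_NUMBER = re.compile(r"\d[\d,]*(?:\.\d+)?")


def words_to_number(phrase: str) -> int:
    """Convert a run of number words such as 'one million' to an integer."""
    total = 0    # value of the groups already closed by thousand/million/billion
    current = 0  # value of the group being built (always below 1,000)
    for word in re.split(r"[\s-]+", phrase.lower().strip()):
        if not word or word == "and":
            continue
        if word in UNITS:
            current += UNITS[word]
        elif word in TENS:
            current += TENS[word]
        elif word == "hundred":
            current = max(current, 1) * 100
        elif word in SCALES:
            total += max(current, 1) * SCALES[word]
            current = 0
        else:
            raise ValueError(f"not a number word: {word!r}")
    return total + current


def normalize_numbers(text: str) -> str:
    """Step 2: rewrite word numbers as digits, e.g. 1,000,000.00."""
    return NUMBER_RUN.sub(lambda m: f"{words_to_number(m.group(0)):,.2f}", text)


def looks_like_penalty(paragraph: str) -> bool:
    """Step 3: flag paragraphs containing both a penalty keyword and a number."""
    normalized = normalize_numbers(paragraph)
    return bool(PENALTY_WORDS.search(normalized)) and bool(ANY_NUMBER.search(normalized))


print(normalize_numbers("a fine not exceeding one million dollars"))
```

A check this simple produces false positives: it converts the "one" in "no one", and it flags any paragraph that mentions both a penalty keyword and a section number. That is why the later steps, such as the exclusion rules in step 6, are needed.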

Screenshot of one of the MySQL tables for penalties

We then note in our search instance whether or not a law has penalties attached to it, so that the search instance can filter for laws that have penalties (as opposed to our regular search, which includes laws that don’t have explicit fines attached to them). This process is run as an offline batch job because our 1.55 million+ laws take several hours to process and no one would wait that long for their search results!

When a user does a search, the search is first sent to our Elasticsearch instance, and then the penalties are looked up from the MySQL database afterwards. This allows full-text search of laws to be combined with penalties, and in a way that results in much less strain on our relational database (because penalties are looked up by IDs rather than a JOIN). Storing the penalties separately allows us to reduce the amount of data in the in-memory search instance, and decouples our services (since we have other types of search like technical standards and law analytics).
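
The two-phase lookup described above can be sketched like this. To keep the example self-contained, a tiny in-memory dictionary stands in for Elasticsearch and Python's built-in sqlite3 module stands in for MySQL. The table name, column names, and sample rows are hypothetical and are not our actual schema.

```python
import sqlite3
from collections import defaultdict


def search_law_ids(query, index):
    """Stand-in for the full-text search step: return IDs of matching laws."""
    terms = query.lower().split()
    return [
        law_id
        for law_id, title in index.items()
        if all(term in title.lower() for term in terms)
    ]


def fetch_penalties(conn, law_ids):
    """Look up penalties by law ID (no JOIN against the laws themselves)."""
    if not law_ids:
        return {}
    placeholders = ",".join("?" for _ in law_ids)
    rows = conn.execute(
        "SELECT law_id, clause, amount_usd FROM penalties "
        f"WHERE law_id IN ({placeholders}) ORDER BY law_id, clause",
        law_ids,
    ).fetchall()
    grouped = defaultdict(list)
    for law_id, clause, amount_usd in rows:
        grouped[law_id].append((clause, amount_usd))
    return dict(grouped)


# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE penalties (law_id INTEGER, clause TEXT, amount_usd REAL)")
conn.executemany(
    "INSERT INTO penalties VALUES (?, ?, ?)",
    [(1, "s. 4", 5000.0), (1, "s. 9", 250.0), (3, "s. 2", 100.0)],
)
index = {1: "Money Laundering Act", 2: "Money Bills Act", 3: "Customs Act"}

ids = search_law_ids("money", index)    # phase 1: full-text search
penalties = fetch_penalties(conn, ids)  # phase 2: lookup by ID
print(ids, penalties)
```

Law 2 matches the search but has no penalty rows, so it is simply absent from the second result. The search layer never needs to hold the penalty data itself.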

The laws themselves are indexed, downloaded, converted to text, parsed, and converted to English, using our pipeline that runs on another Azure VM with RDS as the data store. We make extensive use of the Microsoft Translator API to convert foreign legislation to English (since most of the world’s laws are published in languages other than English). Our use of the service is actually listed on the “Customers” page for Microsoft Translator. We’ve written elsewhere on our blog about some of the ways we gather and process world legislation.


LexisNexis vs. Westlaw: How Many Countries Can You Search?

Which countries can be searched on global legal research platforms? According to our research, Westlaw (as of 2017) has legislation search for 14 countries (counting the EU as a country) and LexisNexis has 12 countries.

Westlaw (Thomson Reuters) and LexisNexis (RELX Group) are the two largest legal research companies in the world. Wolters Kluwer is a close third, and in some jurisdictions it is the main legal research company (its business is split about 50/50 between the EU and North America). All of these companies offer a bewildering list of databases and sources, and none of them bundles multi-country search into a single search engine. On one webpage, Westlaw claims to make 28,000 different databases available worldwide.

According to our research, these are the countries for which LexisNexis has primary legal research search (i.e. national laws in a searchable format):

US, Canada, Australia, New Zealand, France, Ireland, India, UK, EU, South Africa, Hong Kong, Malaysia and Japan.

Westlaw (Thomson Reuters) offers the following countries:

Australia, New Zealand, Canada, UK, EU, USA, Philippines, Qatar, Iraq, UAE, Hong Kong, Argentina, Paraguay and Uruguay.

LexisNexis Sources: http://www.lexisnexis.ca/en/support/resources/KM_LNQLFullServiceInt.pdf, https://www.lexisnexis.com/fr/droit/, http://www.lexisnexis.co.za/, http://www.lexisnexis.com.hk/en-hk/product-line/legal.page, http://www.lexisnexis.jp/ja-jp/Products/lexis-asone.page, http://www.lexisnexis.co.in/en-in/products/lexis-india.page, http://www.lexisnexis.co.nz/en-nz/home.page. One Lexis page, http://www.lexisnexis.com/en-us/products/global-law-news-business-resources.page, notes that there are nine countries in total but that doesn’t seem to be the case.

Westlaw Sources: http://westlawinternational.com/our-solutions/, http://www.westlaw.ie/, http://westlawgulf.com/inside-west-gulf/legislation/, http://www.laleyonline.com.ar/, http://www.laleyonline.com.py/, http://www.laleyonline.com.uy/, http://www.thomsonreuters.co.nz/westlawnz, http://www.westlawasia.com/philippines.

A caveat to the above infographic: either company may offer legislation search for countries that it doesn’t advertise or that are very difficult to discover. They essentially operate as independent businesses in many countries and have other subscription services that are sub-licensed, so there may be flags missing from the infographic above. If you discover a missing country, please let us know so we can update this blog post.


Finding Foreign Laws in English

You can read translations of over 750,000 foreign laws using Global-Regulation.com. Just search for a phrase and click a law from a non-English jurisdiction. A machine translation of the law will be shown on screen, and you can click through to see the law in the original language.

ISO codes on the database coverage page

If you go to our coverage page you’ll see a list of our data sources. The bracketed codes to the right of each region title are the ISO codes for the language. The screenshot to the left shows a few examples of this. Note that “zh” is Chinese, “es” is Spanish and “cs” is Czech.

As of mid-October 2016, we translate over 25 languages into English.


Upgrade to MySQL Cluster

We upgraded our MySQL-based law database system a couple of weeks ago. We’re now running a cluster with a writer and a reader so that the failure of one server doesn’t result in downtime for our users. This new cluster-based system is approximately 15x faster than the old one thanks to much more RAM (necessary to accommodate our rapidly growing index of laws and translations). Although we’ve updated many other parts of the system, we had been running on the same HDD-based MySQL system since October of last year. Our new system is SSD-based and has far higher throughput.

Users will primarily see the difference in how much quicker the “Related Laws” feature is. We’ve measured its performance at about 15x that of the previous MySQL server system. Thanks to our cloud-based infrastructure and standardized components, the transition was quite easy. Within about two days we went from experimenting to a full transition, with no downtime reported by users.


Co-Founder’s Blog Post About Global-Regulation.com

One of our two co-founders has written on his personal blog about the journey from idea to world-leading law search engine that Global-Regulation Inc. has taken. You can read the blog post here: https://www.cameronhuff.com/blog/idea-to-world-law-search-engine-in-one-year/index.html.


Related Laws Feature

We’re pleased to announce that our “Related Laws” feature is now generally available (and fast). When users click the “Related Laws & More Info” button next to each search result there will automatically be a list of related laws generated and shown (where applicable – short laws don’t have this feature because the results aren’t useful).

This feature was previously available and marked as “experimental”. With recent upgrades to our database system we’ve improved the speed of this feature by 15x and can make it widely available to everyone (including users who are not paid subscribers). This is one part of our strategy for making the search experience faster and more useful.


Recently Added Countries

We’ve added a few new countries:

  1. Uruguay
  2. Moldova
  3. Turkmenistan
  4. Norway
  5. Greenland
  6. Madagascar
  7. Malaysia
  8. Greece
  9. Guyana

Photo by John Seb Barber.


Our English Translations of Chinese Laws

Wondering where to find Chinese laws translated into English? You can find many Chinese national and provincial laws by searching our database: https://www.global-regulation.com/search.php?year=&country=China&province=All&start=0&q=china&advanced=false.
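
If you want to build this kind of link programmatically, the following is a minimal Python sketch. The parameter names (year, country, province, start, q, advanced) are taken from the URL above. The helper name build_search_url and its defaults are assumptions for the example, and values not shown in that URL may not be accepted by the search page.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.global-regulation.com/search.php"


def build_search_url(query, country="China", province="All", year="", start=0):
    """Build a search link using the query parameters seen in the URL above."""
    params = {
        "year": year,          # left empty to search all years
        "country": country,
        "province": province,
        "start": start,        # offset of the first result
        "q": query,
        "advanced": "false",
    }
    return f"{BASE_URL}?{urlencode(params)}"


print(build_search_url("china"))
```

Calling build_search_url("china") reproduces the link above. Spaces in a query are encoded as "+", as in q=environmental+protection.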

Global-Regulation’s search engine currently has 7,910 Chinese laws translated into English. The translations are done with either Microsoft Bing Translator or Google Translate.


Visual Explorer of Top 25 Word Pairs By Country

We’ve prepared an interactive visualization that shows the relationships between the top 25 word pairs in the laws of 45 countries.

Check it out here: https://www.global-regulation.com/visualization.php

The raw data files (.gephi and .gexp) are available here: https://www.global-regulation.com/assets/visualization/top_25_word_pairs_gephi_and_gexp_file.zip.