
I'm looking for help constructing a KQL query to find the country for an IP address.

So far I have a table called "Events" with 3 columns like so:

username, logintime, sourceIP

A concrete example would be:

bob, 2020-04-28T21:29:42Z, 1.2.3.4

I also have a table that lists the worldwide network block / subnet arrangement. An example of the data is as follows:

Network, geoname, code, continent, ISO, country

41.74.160.0/24, 49518, AF, Africa, FW, Rwanda

ADX has a function called ipv4_is_match() that will return true if the IP address 'lives' in a particular subnet. So I can run a query like this to return the location for an individual IP:

countryDB | where ipv4_is_match( '41.74.160.123', Network)

What I can't seem to figure out is how to run this across all sourceIP addresses in Events.

I would have liked something like this to work:

Events | union ( countryDB | where ipv4_is_match(sourceIP, Network) )

Can anyone suggest what the proper syntax would be for creating the joined result similar to this:

bob, 2020-04-28T21:29:42Z, 1.2.3.4, 41.74.160.0/24, 49518, AF, Africa, FW, Rwanda

1 Answer
Here is one way: use a lookup on a dummy column to create a Cartesian product, then filter the matches. Here is an example:

let Events = datatable(username:string, logintime:datetime, sourceIP:string) [ "bob", datetime(2020-04-28T21:29:42Z), "41.74.160.0"];
let CountryDB = datatable(Network:string, geoname:string, code:string, continent:string, ISO:string, country:string)["41.74.160.0/24", "49518", "AF", "Africa", "FW", "Rwanda"];
Events
| extend Dummy = 1
| lookup (CountryDB | extend Dummy=1) on Dummy
| where ipv4_is_match(sourceIP, Network)
| project-away Dummy
Avnera
  • Thanks! That's a cool approach. The sample works, but it fails for my data because the dataset gets too large (Query execution has resulted in error (0x80DA0007)). It's because the Events table is large. Any thoughts on how that can be managed? – user3030107 Nov 04 '20 at 21:58
  • Take a look at the partition operator: https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/partitionoperator. Another strategy is to enrich the table during ingestion using an update policy; this should help, as the query will run on a limited dataset in each execution. – Avnera Nov 06 '20 at 16:45
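As a rough sketch of the partition-operator idea from the comment above (reusing the same sample tables from the answer; partitioning by username is an illustrative choice, and the partition key would need reasonably low cardinality under the default strategy), the Cartesian product can be computed per partition so each subquery works on a smaller slice of Events:

let Events = datatable(username:string, logintime:datetime, sourceIP:string) [ "bob", datetime(2020-04-28T21:29:42Z), "41.74.160.123"];
let CountryDB = datatable(Network:string, geoname:string, code:string, continent:string, ISO:string, country:string)["41.74.160.0/24", "49518", "AF", "Africa", "FW", "Rwanda"];
Events
| extend Dummy = 1
| partition by username
(
    // Each subquery sees only one user's rows, so the
    // lookup-based Cartesian product stays small per execution.
    lookup (CountryDB | extend Dummy = 1) on Dummy
    | where ipv4_is_match(sourceIP, Network)
)
| project-away Dummy

This is only a sketch of the approach, not a tested fix for the 0x80DA0007 error; whether it helps depends on the partition key's cardinality and the strategy hint the engine supports.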