
Digging into digital images: Extracting batch location data automatically

A how-to guide covering the basic steps for using open-source metadata extraction software on your desktop and extracting geolocation information from a large number of images.

When an image is taken using a smartphone or camera, certain metadata fields are often attached to it. These fields can include the model of the camera, the time the image was taken, whether the flash was used, the shutter speed, focal length, light value and even the location. The Instagr.am investigation case study featured on Behind Metadata offers a few examples of how this geo-location metadata can be used to verify, expose and protect.

There are a few ways to retrieve geo-location data from images. The easiest is to use an online tool, such as readexifdata.com or Jeffrey's Exif viewer, where the user can upload an image and look at the available metadata. However, users cannot know how those online services work or how secure their images are once uploaded to unknown servers, and these services often restrict the number of images they can process at a given time. This how-to covers two things: first, the basic steps for using open-source metadata extraction software on your desktop; and second, how to extract geo-location information from a large number of images.

Install

ExifTool is a piece of software that can read, write and edit many metadata formats, including EXIF, GPS, IPTC, XMP, JFIF, GeoTIFF, ICC Profile, Photoshop IRB, FlashPix, AFCP and ID3. It is a powerful tool with many possibilities: among other things, it lets users organise images by their metadata and extract geo-location information.

You can get the source and download ExifTool from its website, or you can use your system's package manager, such as yum, pacman, apt-get or brew, to install it on Unix-like operating systems (macOS or Linux).

This example shows how to install ExifTool from a Debian terminal:

sudo apt-get install exiftool
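
To confirm that the installation worked, you can ask ExifTool for its version number with its -ver option. The Ruby sketch below wraps that check; the helper name exiftool_version is our own and is not part of ExifTool.

```ruby
# Return the installed ExifTool version as a string, or nil if the
# `exiftool` command cannot be found on the PATH.
def exiftool_version
  # The array form of IO.popen bypasses the shell; `-ver` prints the version number.
  IO.popen(["exiftool", "-ver"], err: File::NULL, &:read).strip
rescue Errno::ENOENT
  nil
end

version = exiftool_version
puts version ? "ExifTool #{version} is installed" : "ExifTool was not found on the PATH"
```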

Basic usage: reading an image's metadata

Go to your terminal and type the following command, using your own image path and filename: "exiftool Ibiza.jpg".

The correct file path depends on where the image is stored. If it is on your Desktop, the command would be "exiftool /home/user/Desktop/Ibiza.jpg".

You should see a list of data describing the image, such as the following:

ExifTool Version Number         : 10.00
File Name                       : Example Ibiza.JPG
Directory                       : .
File Size                       : 229 kB
File Modification Date/Time     : 2015:10:11 18:58:06+02:00
File Access Date/Time           : 2015:10:13 12:41:47+02:00
File Inode Change Date/Time     : 2015:10:11 18:58:06+02:00
File Permissions                : rw-r--r--
File Type                       : JPEG
File Type Extension             : jpg
MIME Type                       : image/jpeg
JFIF Version                    : 1.01
Profile CMM Type                : Lino
Profile Version                 : 2.1.0
Profile Class                   : Display Device Profile
Color Space Data                : RGB
Profile Connection Space        : XYZ
Profile Date Time               : 1998:02:09 06:49:00
Profile File Signature          : acsp
Primary Platform                : Microsoft Corporation
CMM Flags                       : Not Embedded, Independent
Device Manufacturer             : IEC
Device Model                    : sRGB
Device Attributes               : Reflective, Glossy, Positive, Color
Rendering Intent                : Perceptual
Connection Space Illuminant     : 0.9642 1 0.82491
Profile Creator                 : HP
Profile ID                      : 0
Profile Copyright               : Copyright (c) 1998 Hewlett-Packard Company
Profile Description             : sRGB IEC61966-2.1
Media White Point               : 0.95045 1 1.08905
Media Black Point               : 0 0 0
Red Matrix Column               : 0.43607 0.22249 0.01392
Green Matrix Column             : 0.38515 0.71687 0.09708
Blue Matrix Column              : 0.14307 0.06061 0.7141
Device Mfg Desc                 : IEC http://www.iec.ch
Device Model Desc               : IEC 61966-2.1 Default RGB colour space - sRGB
Viewing Cond Desc               : Reference Viewing Condition in IEC61966-2.1
Viewing Cond Illuminant         : 19.6445 20.3718 16.8089
Viewing Cond Surround           : 3.92889 4.07439 3.36179
Viewing Cond Illuminant Type    : D50
Luminance                       : 76.03647 80 87.12462
Measurement Observer            : CIE 1931
Measurement Backing             : 0 0 0
Measurement Geometry            : Unknown
Measurement Flare               : 0.999%
Measurement Illuminant          : D65
Technology                      : Cathode Ray Tube Display
Red Tone Reproduction Curve     : (Binary data 2060 bytes, use -b option to extract)
Green Tone Reproduction Curve   : (Binary data 2060 bytes, use -b option to extract)
Blue Tone Reproduction Curve    : (Binary data 2060 bytes, use -b option to extract)
Exif Byte Order                 : Big-endian (Motorola, MM)
Make                            : Apple
Camera Model Name               : iPhone 4
Orientation                     : Horizontal (normal)
X Resolution                    : 72
Y Resolution                    : 72
Resolution Unit                 : inches
Software                        : 4.3.5
Modify Date                     : 2011:09:04 12:51:11
Exposure Time                   : 1/3016
F Number                        : 2.8
Exposure Program                : Program AE
ISO                             : 80
Exif Version                    : 0221
Date/Time Original              : 2011:09:04 12:51:11
Create Date                     : 2011:09:04 12:51:11
Components Configuration        : Y, Cb, Cr, -
Shutter Speed Value             : 1/3016
Aperture Value                  : 2.8
Metering Mode                   : Multi-segment
Flash                           : No Flash
Focal Length                    : 3.9 mm
Flashpix Version                : 0100
Color Space                     : sRGB
Exif Image Width                : 1024
Exif Image Height               : 765
Sensing Method                  : One-chip color area
Custom Rendered                 : Unknown (4)
Exposure Mode                   : Auto
White Balance                   : Auto
Scene Capture Type              : Standard
GPS Latitude Ref                : North
GPS Longitude Ref               : East
GPS Altitude Ref                : Above Sea Level
GPS Time Stamp                  : 11:07:47
GPS Img Direction Ref           : True North
GPS Img Direction               : 82.12307692
GPS Date Stamp                  : 2011:09:04
XMP Toolkit                     : XMP Core 5.1.2
Creator Tool                    : 4.3.5
Date Created                    : 2011:09:04 12:51:11
Image Width                     : 1024
Image Height                    : 765
Encoding Process                : Baseline DCT, Huffman coding
Bits Per Sample                 : 8
Color Components                : 3
Y Cb Cr Sub Sampling            : YCbCr4:2:0 (2 2)
Aperture                        : 2.8
GPS Altitude                    : 0 m Above Sea Level
GPS Date/Time                   : 2011:09:04 11:07:47Z
GPS Latitude                    : 38 deg 54' 35.40" N
GPS Longitude                   : 1 deg 26' 19.20" E
GPS Position                    : 38 deg 54' 35.40" N, 1 deg 26' 19.20" E
Image Size                      : 1024x765
Megapixels                      : 0.783
Shutter Speed                   : 1/3016
Focal Length                    : 3.9 mm
Light Value                     : 14.9
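
The same fields can also be read from a program. ExifTool's -json option prints one JSON object per file, keyed by tag name (for example "Model") rather than by the description shown in the listing ("Camera Model Name"). The sketch below is only an illustration: the helper names read_metadata and summarize are our own, and the sample data is hand-written from a few values in the listing so that the example runs without ExifTool or the image.

```ruby
require "json"
require "shellwords"

# Run `exiftool -json` on one file and return its tags as a Hash.
# ExifTool prints a JSON array with one object per file.
def read_metadata(file)
  JSON.parse(`exiftool -json #{Shellwords.escape(file)}`).first
end

# Keep only the fields of interest; tags the image lacks are simply left out.
def summarize(tags, wanted = %w[Make Model DateTimeOriginal GPSPosition])
  tags.select { |name, _value| wanted.include?(name) }
end

# Hand-written sample (an assumption, not real ExifTool output).
sample = JSON.parse(<<~JSON).first
  [{ "SourceFile": "Ibiza.jpg", "Make": "Apple", "Model": "iPhone 4",
     "DateTimeOriginal": "2011:09:04 12:51:11", "Flash": "No Flash" }]
JSON
p summarize(sample)
```

With ExifTool installed, summarize(read_metadata("Ibiza.jpg")) would do the same for a real image.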

From the metadata above we can see the geo-location of the image, the phone it was taken on (in this case an iPhone 4), the exact date and time it was taken, the light value (an indication of how bright the scene was) and more. For this geo-location to be processed by most online mapping services, the latitude and longitude must be represented in a format other than ExifTool's default of degrees, minutes and seconds.

The coordinate format is set with the -c option, as the next command shows. Here "%.6f" prints each coordinate as decimal degrees with six decimal places. Type the following command into your terminal:

exiftool -c "%.6f" Ibiza.JPG

GPS Latitude                    : 38.909833 N
GPS Longitude                   : 1.438667 E
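
The conversion behind these numbers is simple arithmetic: decimal degrees = degrees + minutes/60 + seconds/3600, so 38 deg 54' 35.40" becomes 38 + 0.9 + 0.009833 = 38.909833. As a rough sketch (the helper names dms_to_decimal and parse_dms are our own), the same calculation in Ruby looks like this. Note that it also turns South and West references into negative values, which is the convention most mapping services expect.

```ruby
# Convert degrees, minutes and seconds to signed decimal degrees.
# South and West references give negative values.
def dms_to_decimal(degrees, minutes, seconds, ref)
  decimal = degrees + minutes / 60.0 + seconds / 3600.0
  %w[S W].include?(ref) ? -decimal : decimal
end

# Parse ExifTool's default coordinate format, e.g. 38 deg 54' 35.40" N
def parse_dms(text)
  match = text.match(/(\d+) deg (\d+)' ([\d.]+)" ([NSEW])/)
  return nil unless match

  dms_to_decimal(match[1].to_i, match[2].to_i, match[3].to_f, match[4])
end

puts format("%.6f", parse_dms(%q(38 deg 54' 35.40" N))) # 38.909833
puts format("%.6f", parse_dms(%q(1 deg 26' 19.20" E)))  # 1.438667
```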

You can now use this location for reverse geocoding: the process of obtaining a readable address or place name from coordinates. The OpenStreetMap Nominatim web service can provide geographical metadata related to these coordinates.

The location can also be searched directly on OpenStreetMap by querying only this latitude and longitude.
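
Both lookups come down to building a URL from the two numbers. The sketch below assembles a Nominatim reverse-geocoding URL and an OpenStreetMap URL that drops a marker on the coordinates. The helper names are our own, and the zoom level of 15 is an arbitrary choice; the code only builds the URLs and does not contact either service.

```ruby
require "uri"

# Build a Nominatim reverse-geocoding URL for a coordinate pair.
def nominatim_reverse_url(lat, lon)
  query = URI.encode_www_form(format: "jsonv2", lat: lat, lon: lon)
  "https://nominatim.openstreetmap.org/reverse?#{query}"
end

# Build an OpenStreetMap URL that shows a marker at the coordinates.
def osm_marker_url(lat, lon, zoom = 15)
  query = URI.encode_www_form(mlat: lat, mlon: lon)
  "https://www.openstreetmap.org/?#{query}#map=#{zoom}/#{lat}/#{lon}"
end

puts nominatim_reverse_url("38.909833", "1.438667")
puts osm_marker_url("38.909833", "1.438667")
```

If you request the Nominatim URL from a script rather than a browser, keep in mind that the public service asks clients to identify themselves with a User-Agent header and to stay within one request per second.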

Processing large amounts of images

Now imagine you have 50, 1,000 or 100,000 images and want to know which of them carry geo-location metadata. Checking each one individually would be nearly impossible. Geo-location information can be a very important piece of the puzzle when conducting an investigation, but social media platforms often strip this metadata when users upload images to their servers, and individuals might strip the metadata themselves before uploading images online. In a batch of online images you might expect as few as 2% of them to have geo-location information in their metadata, as some of our tests show. With this in mind, the usefulness of automating the process of sifting through hundreds or thousands of images becomes more apparent.

We wrote a small script in the Ruby programming language that not only sifts through the images to pick out this geo-location data automatically but also plots the locations on a map. Place this script in a file called "geobatch.rb" and run it in the folder containing all the images you want to process:

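require "shellwords"

# Run exiftool on one file. -G prefixes every line with its metadata group,
# such as "[EXIF]". The file name is escaped so that spaces survive the shell.
def read_metadata(file)
  `exiftool -G -c "%.6f" #{Shellwords.escape(file)}`
end

# Return ["lat", "lon"] in signed decimal degrees, or nil if there is no position.
def get_gps_from_exif(metadata)
  position = metadata[/GPS Position.*/].to_s.scan(/(\d+\.\d+) ([NSEW])/)
  return nil unless position.count == 2 # lat and long
  # South and West become negative so markers land in the right hemisphere
  position.map { |value, ref| %w[S W].include?(ref) ? "-#{value}" : value }
end

zoom = 2
path = "*.JPG"
map_url = "http://staticmap.openstreetmap.de/staticmap.php?&zoom=#{zoom}&size=865x512&maptype=mapnik&markers="
all = Dir.glob(path)
total = all.count
has_gps = 0
meta_exif = 0
all.each do |file|
  metadata = read_metadata(file)
  # Count images with any EXIF block separately from images with a location
  meta_exif += 1 if metadata.match?(/^\[EXIF\]/)
  gps = get_gps_from_exif(metadata)
  next unless gps
  coord = gps.join(",")
  puts "=> #{file} @ #{coord}"
  map_url += "#{coord},lightblue#{file}|"
  has_gps += 1
end
puts "=> Total #{total} images | #{meta_exif} with EXIF | #{has_gps} with location"
# Share of the images with EXIF data that also carry a location
percentage = meta_exif.zero? ? 0.0 : has_gps * 100.0 / meta_exif
puts "=> Percentage with location = %3.2f" % percentage
puts "=> Map URL: #{map_url}"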

The result will look similar to the text below, listing the images that have geo-location information attached to them (in this case there were four images).

$ ruby geobatch.rb
...
=> 4.JPG @ 37.529000,122.266000
=> 3.JPG @ 43.740167,7.430000
=> 2.JPG @ 37.421833,122.084333
=> 1.JPG @ 41.373167,2.189500
=> Total 35 images | 5 with EXIF | 4 with location
=> Percentage with location = 80.00
=> Map URL: http://staticmap.openstreetmap.de/staticmap.php?&zoom=2&size=865x512&maptype=mapnik&markers=37.529000,122.266000,lightblue4.JPG|43.740167,7.430000,lightblue3.JPG|37.421833,122.084333,lightblue2.JPG|41.373167,2.189500,lightblue1.JPG

The script produces a URL with all the available coordinates plotted as markers, which can be viewed on an OpenStreetMap map like the image below.
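
For very large batches, starting ExifTool once per image becomes slow. One alternative, sketched here under some assumptions, is to let ExifTool scan the whole folder in a single run with a command such as "exiftool -json -n -GPSLatitude -GPSLongitude ." and filter the JSON in Ruby. We assume that the -n option yields signed decimal coordinates (negative for South and West) and that every file appears in the JSON with at least a SourceFile entry. The helper names and the sample data are our own, hand-written so the example runs without ExifTool.

```ruby
require "json"

# Keep only the entries that carry both coordinates.
def located(entries)
  entries.select { |entry| entry["GPSLatitude"] && entry["GPSLongitude"] }
end

# Format each located image as "file @ lat,lon" with six decimals.
def report(entries)
  located(entries).map do |entry|
    format("%s @ %.6f,%.6f", entry["SourceFile"], entry["GPSLatitude"], entry["GPSLongitude"])
  end
end

# Hand-written sample standing in for ExifTool's JSON output (an assumption).
sample = JSON.parse(<<~JSON)
  [{ "SourceFile": "./1.JPG", "GPSLatitude": 41.373167, "GPSLongitude": 2.1895 },
   { "SourceFile": "./2.JPG" }]
JSON
puts report(sample)
```

With ExifTool installed, the sample could be replaced by JSON.parse of the command's output, run from the folder that holds the images.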
