User Manual

Quick start

PyEPR provides Python bindings for the ENVISAT Product Reader C API (EPR API) for reading satellite data from the ENVISAT mission of ESA (European Space Agency).

PyEPR, as well as the EPR API for C, supports ENVISAT MERIS, AATSR Level 1B and Level 2 and also ASAR data products. It provides access to the data either on a geophysical (decoded, ready-to-use pixel samples) or on a raw data layer. The raw data access makes it possible to read any data field contained in a product file.

Full access to the Python EPR API is provided by the epr module, which has to be imported by the client program, e.g. as follows:

import epr

The following snippet opens an ASAR product and dumps the “Main Processing Parameters” record to the standard output:

import epr

product = epr.Product(
    'ASA_IMP_1PNUPA20060202_062233_000000152044_00435_20529_3110.N1')
dataset = product.get_dataset('MAIN_PROCESSING_PARAMS_ADS')
record = dataset.read_record(0)
print(record)
product.close()

Requirements

In order to use PyEPR, the following software must be correctly installed and configured:

  • Python2 >= 2.6 or Python3 >= 3.1
  • numpy >= 1.5.0
  • EPR API >= 2.2 (optional: since PyEPR 0.7 the source tar-ball comes with a copy of the EPR C API sources)
  • a reasonably updated C compiler [1] (build only)
  • Cython >= 0.13 [2] (optional and build only)
  • unittest2 (only required for Python < 2.7)
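The Python version constraint in the list above can also be checked programmatically. The following is only a minimal sketch: the python_supported helper is a hypothetical name introduced here for illustration and is not part of PyEPR.

```python
import sys


def python_supported(version_info=None):
    """Return True if the given interpreter version satisfies the
    requirement listed above (Python2 >= 2.6 or Python3 >= 3.1).

    python_supported is a hypothetical helper, not part of PyEPR.
    """
    if version_info is None:
        version_info = sys.version_info
    major, minor = version_info[0], version_info[1]
    if major == 2:
        return minor >= 6
    return (major, minor) >= (3, 1)


if __name__ == '__main__':
    print('Python version supported:', python_supported())
```

The other requirements (numpy, Cython, the C compiler) are not covered by this sketch.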

Note

In order to build PyEPR for Python3, Cython >= 0.15 is required.

[1] PyEPR has been developed and tested with gcc 4.
[2] The source tarball of official releases also includes the C extension code generated by Cython, so users don't strictly need Cython to install PyEPR. Cython is only needed to re-generate the C extension code (e.g. if one wants to build a development version of PyEPR).

Download

Official source tar-balls can be downloaded from PyPI.

The source code of the development versions is available on the GitHub project page.

To clone the git repository the following command can be used:

$ git clone https://github.com/avalentino/pyepr.git

Installation

The easiest way to install PyEPR is to use tools like pip or easy_install:

$ pip install numpy pyepr

Note

The setup.py script does not use easy_install specific functions, so it is unable to handle dependencies automatically.

The setup.py script also uses numpy to retrieve the path of headers and libraries. For this reason numpy must already be installed when setup.py is executed.

In the above example the required numpy package is explicitly included in the list of packages to be installed.

See also

Requirements

PyEPR uses the standard Python distutils so it can be installed from sources using the following command:

$ python setup.py install

For a user specific installation use:

$ python setup.py install --user

To install PyEPR in a non-standard path:

$ python setup.py install --prefix=<TARGET_PATH>

Just make sure that <TARGET_PATH>/lib/pythonX.Y/site-packages is in the PYTHONPATH.
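The check on the installation path can be sketched in Python as follows. This is only an illustration: the site_packages_dir and is_on_python_path helpers are hypothetical names introduced here, and the directory layout is the POSIX-style one quoted above.

```python
import os
import sys


def site_packages_dir(target_path):
    """Build <TARGET_PATH>/lib/pythonX.Y/site-packages for the running
    interpreter (layout as described in the text above)."""
    version = 'python%d.%d' % (sys.version_info[0], sys.version_info[1])
    return os.path.join(target_path, 'lib', version, 'site-packages')


def is_on_python_path(directory):
    """Return True if directory is listed in sys.path, which includes
    the entries of the PYTHONPATH environment variable."""
    directory = os.path.abspath(directory)
    return any(os.path.abspath(entry) == directory
               for entry in sys.path if entry)
```

For example, is_on_python_path(site_packages_dir('/opt/pyepr')) tells whether a package installed with --prefix=/opt/pyepr (an arbitrary example path) can be imported by the running interpreter.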

The setup.py script by default checks for the availability of the EPR C API source code in the <package-root>/epr-api-src directory and tries to build PyEPR in standalone mode, i.e. without linking an external dynamic library of EPR-API.

If no EPR C API sources are found then the setup.py of PyEPR automatically tries to link the EPR-API dynamic library. This can happen, for example, if the user is using a copy of the PyEPR sources cloned from a git repository. In this case it is assumed that the EPR API C library is properly installed in the system (see the Requirements section).

It is possible to control which EPR API C sources to use by means of the --epr-api-src option of the setup.py script:

$ python setup.py install --epr-api-src=../epr-api/src

It is also possible to switch off the standalone mode and force linking against the system EPR API C library:

$ python setup.py install --epr-api-src=None

Testing

The PyEPR package comes with a complete test suite. In order to run it, the ENVISAT sample product used for testing (MER_LRC_2PTGMV20000620_104318_00000104X000_00000_00000_0001.N1) has to be downloaded from the ESA website, saved in the test directory and decompressed.

On GNU/Linux platforms the following shell commands can be used:

$ cd pyepr-0.X/test
$ wget http://earth.esa.int/services/sample_products/meris/LRC/L2/\
  MER_LRC_2PTGMV20000620_104318_00000104X000_00000_00000_0001.N1.gz
$ gunzip MER_LRC_2PTGMV20000620_104318_00000104X000_00000_00000_0001.N1.gz
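On platforms where the gunzip command is not available, the decompression step can be performed with the Python standard library. The following is a minimal sketch: the gunzip helper is a hypothetical name introduced here, and the download step is not covered.

```python
import gzip
import shutil


def gunzip(gz_path, out_path=None):
    """Decompress gz_path and return the path of the decompressed file.

    Unlike the gunzip command, the compressed file is left in place.
    """
    if out_path is None:
        if not gz_path.endswith('.gz'):
            raise ValueError('expected a file name ending with ".gz"')
        out_path = gz_path[:-len('.gz')]
    # Stream the data in chunks so that large products fit in memory.
    with gzip.open(gz_path, 'rb') as src, open(out_path, 'wb') as dst:
        shutil.copyfileobj(src, dst)
    return out_path
```

Calling gunzip() on the downloaded .N1.gz file in the test directory produces the .N1 file expected by the test suite.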

After installation the test suite can be run using the following command in the test directory:

$ python test_all.py

Python vs C API

The Python EPR API is fully object oriented. The main structures of the C API have been implemented as objects, while C functions have been logically grouped and mapped onto object methods.

The entire process of defining an object oriented API for Python has been quite easy and straightforward thanks to the good design of the C API.

Of course there are also some differences that are illustrated in the following sections.

Memory management

Since Python is a very high level language, users never have to worry about memory allocation/deallocation. They simply have to instantiate objects:

product = epr.Product('filename.N1')

and use them freely.

Objects are automatically destroyed when there are no more references to them and memory is deallocated automatically.

Even better, each object holds a reference to the other objects it depends on, so the user never has to worry about the validity of identifiers or about the correct order in which structures have to be freed.

For example: the C EPR_DatasetId structure has a field (product_id) that points to the product descriptor EPR_ProductId to which it belongs.

The reference to the parent product is used, for example, when one wants to read a record using the epr_read_record function:

EPR_SRecord* epr_read_record(EPR_SDatasetId* dataset_id, ...);

The function takes an EPR_SDatasetId as a parameter and assumes that all fields (including dataset_id->product_id) are valid. It is the responsibility of the programmer to keep all structures valid and to free them at the right moment and in the correct order.

This is the standard way to go in C but not in Python.

In Python everything is far simpler: the user can get a dataset object instance:

dataset = product.get_dataset('MAIN_PROCESSING_PARAMS_ADS')

and then forget about the product instance it depends on. Even if the product variable goes out of scope and is no longer directly accessible in the program, the dataset object stays valid since it holds an internal reference to the product instance it depends on.

When the dataset object is destroyed, the parent epr.Product object is automatically destroyed too (assuming there are no other references to it).

The entire machinery is completely automatic and transparent to the user.
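The mechanism can be illustrated with a toy model in pure Python. The ToyProduct and ToyDataset classes below are hypothetical stand-ins written for this illustration; they are not the PyEPR implementation, and the immediate destruction shown relies on the reference counting of CPython.

```python
import gc
import weakref


class ToyProduct(object):
    """Stand-in for a product: creates dataset objects."""

    def get_dataset(self, name):
        return ToyDataset(self, name)


class ToyDataset(object):
    """Stand-in for a dataset: keeps a reference to its parent product."""

    def __init__(self, product, name):
        self._product = product   # this reference keeps the parent alive
        self.name = name


product = ToyProduct()
probe = weakref.ref(product)      # observes the product without owning it
dataset = product.get_dataset('MAIN_PROCESSING_PARAMS_ADS')

del product                       # the name is gone ...
gc.collect()
product_alive_after_del_product = probe() is not None   # ... the object is not

del dataset                       # the last owner of the product goes away
gc.collect()
product_alive_after_del_dataset = probe() is not None
```

In this model the product survives the deletion of its own name for as long as the dataset exists, and it is released only once the dataset is destroyed.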

Note

Of course, when a product object is explicitly closed using epr.Product.close(), I/O operations on it and on the other objects (bands, datasets, etc.) associated with it are no longer possible.

Arrays

PyEPR uses numpy in order to manage efficiently the potentially large amount of data contained in ENVISAT products.

  • epr.Field.get_elems() returns a 1D array containing the elements of the field

  • the epr.Raster.data property is a 2D numpy.ndarray that exposes the data contained in the epr.Raster object

    Note

    epr.Raster.data directly exposes the data of epr.Raster, i.e. it shares the same memory buffer with epr.Raster:

    >>> raster.get_pixel(i, j)
    5
    >>> raster.data[i, j]
    5
    >>> raster.data[i, j] = 3
    >>> raster.get_pixel(i, j)
    3
    
  • epr.Band.read_as_array() is an additional method provided by the Python EPR API (there is no corresponding function in the C API). It is mainly a convenience method that allows users to access band data without creating an intermediate epr.Raster object. It reads a slice of data from the epr.Band and returns it as a 2D numpy.ndarray.
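Reading a rectangular window of a band with epr.Band.read_as_array() could look like the following sketch. The read_window helper is a hypothetical name introduced here, and the argument names passed to read_as_array() (width, height, xoffset, yoffset) are an assumption that should be checked against the reference manual.

```python
def read_window(band, xoffset, yoffset, width, height):
    """Read a width x height window of band whose upper-left corner is
    located at (xoffset, yoffset).

    Assumption: band.read_as_array() accepts the width and height
    positional arguments and the xoffset and yoffset keyword arguments.
    """
    return band.read_as_array(width, height,
                              xoffset=xoffset, yoffset=yoffset)


# Typical use (requires PyEPR and an ENVISAT product file):
#
#     import epr
#     product = epr.Product('filename.N1')
#     band = product.get_band('proc_data')
#     data = read_window(band, 0, 0, 100, 100)
#     product.close()
```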

Enumerators

Python does not have enumerators at language level (at least this is true for Python < 3.4). Enumerations are simply mapped to module constants that have the same name as the C enumerators but are spelled in all capital letters.

For example:

C              Python
e_tid_double   E_TID_DOUBLE
e_smod_1OF1    E_SMOD_1OF1
e_smid_log     E_SMID_LOG
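The naming rule can be expressed in a couple of lines of Python. This is just a sketch of the convention: the python_constant_name and lookup_constant helpers are hypothetical names introduced here and are not part of PyEPR.

```python
def python_constant_name(c_name):
    """Return the name of the module constant corresponding to a C
    enumerator, following the all-capitals rule described above."""
    return c_name.upper()


def lookup_constant(module, c_name):
    """Fetch the constant from module (e.g. the imported epr module)."""
    return getattr(module, python_constant_name(c_name))
```

With PyEPR installed, lookup_constant(epr, 'e_tid_double') would then be equivalent to writing epr.E_TID_DOUBLE.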

Error handling and logging

Currently the error handling and logging functions of the EPR C API are not exposed to Python.

Internal library logging is completely silenced and errors are converted to Python exceptions. Where appropriate, standard Python exception types are used; in other cases custom exception types (e.g. epr.EPRError, epr.EPRValueError) are used.

Library initialization

Unlike in the C API, library initialization is not needed: it is performed internally the first time the epr module is imported in Python.

High level API

PyEPR provides some utility methods that have no counterpart in the C API.

Example:

for dataset in product.datasets():
    for record in dataset.records():
        print(record)
        print()

Another example:

if 'proc_data' in product.band_names():
    band = product.get_band('proc_data')
    print(band)

Special methods

The Python EPR API also implements some special methods in order to make EPR programming even handier and, in short, pythonic.

The __repr__ methods have been overridden to provide a little more information than the standard implementation.

In some cases the __str__ method has been overridden to output a verbose string representation of the objects and their contents.

If the EPR object has a print_ method (like e.g. epr.Record.print_() and epr.Field.print_()) then the string representation of the object will have the same format used by the print_ method. So writing:

fd.write(str(record))

gives the same result as:

record.print_(fd)

Of course the epr.Record.print_() method is more efficient for writing to file.

The epr.Dataset and epr.Record classes also implement the __iter__ special method for iterating over records and fields respectively, so it is possible to write code like the following:

for record in dataset:
    for index, field in enumerate(record):
        print(index, field)

The epr.DSD and epr.Field classes implement the __eq__ and __ne__ methods for object comparison:

if field1 == field2:
    print('field1 and field2 are equal')
    print(field1)
else:
    print('field1:', field1)
    print('field2:', field2)

epr.Field objects also implement the __len__ special method, which returns the number of elements in the field:

if field.get_type() != epr.E_TID_STRING:
    assert field.get_num_elems() == len(field)
else:
    assert len(field) == len(field.get_elem())

Note

Unlike the epr.Field.get_num_elems() method, len(field) returns the number of elements only if the field type is not epr.E_TID_STRING. If the field contains a string then the string length is returned.

Finally the epr.Product class acts as a context manager (i.e. it implements the __enter__ and __exit__ methods).

This allows the user to write code like the following:

with epr.open('ASA_IMS_ ... _4650.N1') as product:
    print(product)

which ensures that the product is closed as soon as the program exits the with block.
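The context manager protocol itself can be illustrated with a toy class in pure Python. The ToyProduct class below is a hypothetical stand-in written for this illustration and is not the PyEPR implementation; it only shows why close() is called when the with block is left, even if an error occurs inside it.

```python
class ToyProduct(object):
    """Stand-in for a product object with a close() method."""

    def __init__(self, filename):
        self.filename = filename
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False    # do not swallow exceptions raised in the block


with ToyProduct('filename.N1') as product:
    closed_inside_block = product.closed    # still open here

closed_after_block = product.closed         # closed on exit

try:
    with ToyProduct('filename.N1') as failing:
        raise RuntimeError('error while processing')
except RuntimeError:
    pass
closed_after_error = failing.closed         # closed despite the error
```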