EggLib documentation

EggLib is a C++/Python library and program package for evolutionary genetics and genomics. Its main features are sequence data management, sequence polymorphism analysis, coalescent simulations and Approximate Bayesian Computation. EggLib combines a flexible Python module with a performant underlying C++ library (which can be used independently), allowing fast and intuitive development of Python programs and scripts. A number of pre-programmed applications built on EggLib can also be run interactively. To get an idea of the possibilities offered by EggLib, see the Manual section.

Get EggLib: instructions, download.

Citation: De Mita S. and M. Siol. 2012. EggLib: processing, analysis and simulation tools for population genetics and genomics. BMC Genet. 13:27. Open access.

What’s new?

  • October 04, 2014: Version 2.1.9 is out to fix a rather minor error affecting the Staden parser only (in particular staden2fasta).

  • September 23, 2014: Version 2.1.8 is out to fix a serious error affecting the abc_sample command, and more precisely the TPS, TPF and TPK summary statistics sets. The program used only the last locus when computing per-population π, ignoring all previous loci. The three summary statistics sets are now fixed. The error did not affect the other summary statistics sets, nor the other statistics of those three sets. Since this behaviour was consistent for both observed data sets and simulations, the consequence was, in principle, a complete lack of resolution of these summary statistics sets rather than erroneous results, but it is very likely that results obtained with these summary statistics sets were inaccurate.

    Also: version 2.2.0 has been renumbered as version 3 due to large-scale changes in the interface. A preliminary package of version 3 is out (for testing purposes).

  • November 20, 2013: The pre-compiled package of version 2.1.7 is available for 32-bit Python 2.7.

  • November 7, 2013: Version 2.1.7 (source only, at the moment) is out to include minor changes. Warning: if you use non-standard genetic codes, you must update, because an error prevented the code argument from being passed to the underlying polymorphism analysis routines in egglib-py.

  • April 22, 2013: Version 2.1.6 is out to match version 2.1.0 of Bio++. No other change has been made. Meanwhile, egglib 2.2.0 is still under development: the C++ library has been redesigned for improved performance, VCF and GFF3 support has been included for population genomics analyses, and we are now incorporating additional summary statistics.

  • October 20, 2012: Release of version 2.1.5 incorporating small changes. Be careful if you have used or are using the SM model in the ABC framework, as the parameters were not named properly.
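
The version 2.1.8 note above describes per-population π being computed from the last locus only. The Python sketch below illustrates that kind of error on toy data. It is not EggLib code: the function names are hypothetical, π is taken here as the mean proportion of pairwise differences, and averaging π across loci is an assumption, since this page does not describe how the TPS, TPF and TPK sets combine loci.

```python
from itertools import combinations


def pairwise_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    length = len(seqs[0])
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


def mean_pi_all_loci(loci):
    """Intended behaviour (assumed): average pi over every locus."""
    return sum(pairwise_diversity(locus) for locus in loci) / len(loci)


def pi_last_locus_only(loci):
    """Behaviour described for the error: all loci but the last are ignored."""
    return pairwise_diversity(loci[-1])


# Toy data for one population: two loci, three aligned sequences each.
loci = [
    ["ACGT", "ACGA", "TCGA"],  # variable locus
    ["GGCC", "GGCC", "GGCC"],  # invariant locus
]

print(mean_pi_all_loci(loci))
print(pi_last_locus_only(loci))
```

Because the last locus in this toy data set is invariant, the last-locus-only value carries none of the variation present at the first locus. Applying the same shortcut to both observed and simulated data keeps the comparison consistent, but the statistic then reflects a single locus.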


  • An underlying C++ library which might be used independently.

  • Two standalone programs:
    • eggcoal: an extensive coalescent simulator.
    • eggstats: a simple command line tool for analyzing diversity in fasta files.
  • A flexible Python module bringing together the C++ library and additional high-level tools. See Python module.

  • A script egglib providing a number of modular tools for processing and analyzing sequence data (and others). See Directly executable commands.


These pages describe the Python module and the C++ library. They are available as an independent downloadable archive from the download site. A PDF version of the general description of EggLib and of the egglib-py reference manual is available here, and a PDF version of the reference manual of the C++ library is available there.
