Data assimilation aims to determine the state of a dynamical system as accurately as possible by combining heterogeneous sources of information in an optimal way. Generally speaking, the mathematical methods of data assimilation describe algorithms for forming optimal combinations of observations of a system, a numerical model that describes its evolution, and appropriate prior information. Data assimilation has a long history of application to high-dimensional geophysical systems, beginning in the 1960s with the estimation of initial conditions for weather forecasts. It has become a major component of numerical forecasting systems in geophysics and an intensive field of research, with numerous additional applications in oceanography and atmospheric chemistry, and extensions to other geophysical sciences. The physical complexity and high dimensionality of geophysical systems have led the geophysics community to make significant contributions to the fundamental theory of data assimilation. This book gathers notes from lectures and seminars on theoretical and applied data assimilation, given by internationally recognized scientists during a three-week school held at the Les Houches School of Physics in 2012. It is composed of (i) a series of main lectures, presenting the fundamentals of the most commonly used methods and the information theory background required to understand and evaluate the role of observations; and (ii) a series of specialized lectures, addressing various aspects of data assimilation in detail, from the most recent developments of the theory to the specifics of various thematic applications.