CSSBox is an (X)HTML/CSS rendering engine written in pure Java. Its primary purpose is to provide complete, machine-processable information about the content and layout of a rendered page. It can also display the rendered document.
The input to the rendering engine is the document's DOM tree and the set of style sheets referenced from the document. The output is an object-oriented model of the page layout. This model can be displayed directly, but it is mainly intended for further processing by layout analysis algorithms, such as page segmentation or information extraction.
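
To illustrate what "further processing" of a layout model can look like, the following is a minimal sketch that does not use the CSSBox API. The `Box` class, the `textBoxesInBand` helper, and the hand-built sample page are all hypothetical stand-ins, assumed here only to show the idea: once the layout is available as a tree of positioned boxes, an analysis step can traverse it and select content by its position on the page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for a layout model.
// These are NOT CSSBox classes; the names are assumptions made for illustration.
class LayoutModelSketch {

    /** A rectangular box on the rendered page, with absolute pixel bounds. */
    static class Box {
        final String name;
        final int x, y, width, height;
        final String text;                       // non-null only for text boxes
        final List<Box> children = new ArrayList<>();

        Box(String name, int x, int y, int width, int height, String text) {
            this.name = name;
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
            this.text = text;
        }

        /** Appends a child box and returns this box to allow chaining. */
        Box add(Box child) {
            children.add(child);
            return this;
        }
    }

    /** Collects, depth first, all text boxes lying entirely between top and bottom. */
    static List<Box> textBoxesInBand(Box root, int top, int bottom) {
        List<Box> result = new ArrayList<>();
        collect(root, top, bottom, result);
        return result;
    }

    private static void collect(Box box, int top, int bottom, List<Box> out) {
        if (box.text != null && box.y >= top && box.y + box.height <= bottom) {
            out.add(box);
        }
        for (Box child : box.children) {
            collect(child, top, bottom, out);
        }
    }

    /** Builds a tiny hand-made layout tree standing in for a rendered page. */
    static Box samplePage() {
        Box header = new Box("header", 0, 0, 800, 100, null)
                .add(new Box("title", 10, 20, 300, 40, "Example title"));
        Box body = new Box("body", 0, 100, 800, 500, null)
                .add(new Box("para", 10, 120, 780, 60, "First paragraph"));
        return new Box("page", 0, 0, 800, 600, null).add(header).add(body);
    }

    public static void main(String[] args) {
        // Select the text found in the top 100 pixels of the page.
        for (Box b : textBoxesInBand(samplePage(), 0, 100)) {
            System.out.println(b.name + " @ (" + b.x + "," + b.y + "): " + b.text);
        }
    }
}
```

In this sketch the selection rule is a plain vertical band. A real page segmentation or information extraction algorithm would apply richer criteria to the same kind of tree, for example visual separators, font properties, or alignment between neighbouring boxes.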