Class DomSerializer


  • public class DomSerializer
    extends Object

    DOM serializer: creates an XML DOM.

    • Field Detail

      • props

        protected CleanerProperties props
        The HTML Cleaner properties set by the user to control the HTML cleaning.
      • escapeXml

        protected boolean escapeXml
        Whether XML entities should be escaped or not.
      • deserializeCdataEntities

        protected boolean deserializeCdataEntities
        Whether entities in CData sections should be deserialized.
      • strictErrorChecking

        protected boolean strictErrorChecking
        Whether Document strict error checking is left on; if false, it is turned off.
    • Constructor Detail

      • DomSerializer

        public DomSerializer(CleanerProperties props,
                             boolean escapeXml,
                             boolean deserializeCdataEntities,
                             boolean strictErrorChecking)
        Parameters:
        props - the HTML Cleaner properties set by the user to control the HTML cleaning.
        escapeXml - if true then escape XML entities
        deserializeCdataEntities - if true then deserialize entities in CData sections
        strictErrorChecking - if false then Document strict error checking is turned off
      • DomSerializer

        public DomSerializer(CleanerProperties props,
                             boolean escapeXml,
                             boolean deserializeCdataEntities)
        Parameters:
        props - the HTML Cleaner properties set by the user to control the HTML cleaning.
        escapeXml - if true then escape XML entities
        deserializeCdataEntities - if true then deserialize entities in CData sections
      • DomSerializer

        public DomSerializer(CleanerProperties props,
                             boolean escapeXml)
        Parameters:
        props - the HTML Cleaner properties set by the user to control the HTML cleaning.
        escapeXml - if true then escape XML entities
      • DomSerializer

        public DomSerializer(CleanerProperties props)
        Parameters:
        props - the HTML Cleaner properties set by the user to control the HTML cleaning.
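
The strictErrorChecking flag described for the four-argument constructor refers to a setting on the W3C Document. The sketch below uses only the JDK's DOM API to show what that setting guards against; it does not call DomSerializer, and the class and method names (StrictErrorCheckingDemo, newDocument, acceptsTagName) are illustrative assumptions. With strict checking on, a tag name that is not a legal XML name is rejected with a DOMException. What a lenient document accepts is implementation-dependent, so the program prints the result rather than assuming it.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;

class StrictErrorCheckingDemo {

    /** Creates an empty W3C Document with strict error checking set as requested. */
    static Document newDocument(boolean strictErrorChecking) {
        try {
            Document document =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            document.setStrictErrorChecking(strictErrorChecking);
            return document;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }

    /** Returns true if the document creates the element, false if it throws a DOMException. */
    static boolean acceptsTagName(Document document, String tagName) {
        try {
            document.createElement(tagName);
            return true;
        } catch (DOMException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Not a legal XML name because it starts with a digit; messy HTML can contain such names.
        String suspectName = "1st-item";
        System.out.println("strict:  " + acceptsTagName(newDocument(true), suspectName));
        System.out.println("lenient: " + acceptsTagName(newDocument(false), suspectName));
    }
}
```

Turning the check off is mainly useful when the cleaned HTML may carry element or attribute names that a strict Document would refuse.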
    • Method Detail

      • isScriptOrStyle

        protected boolean isScriptOrStyle(Element element)
        Parameters:
        element - the element to check
        Returns:
        true if the passed element is a script or style element
      • dontEscape

        protected boolean dontEscape(Element element)
        Encapsulates content with <![CDATA[ ]]> for things like script and style elements.
        Parameters:
        element - the element to check
        Returns:
        true if <![CDATA[ ]]> should be used.
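
As a rough illustration of the CDATA decision, the sketch below appends content to a W3C element either as a CDATA section or as a plain text node. It is a simplified stand-in written against the JDK's DOM API, not DomSerializer's code: the class CDataWrappingDemo and its methods are hypothetical, and the check looks only at the element name, whereas the real method may also consult the cleaner properties.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class CDataWrappingDemo {

    /** Hypothetical name-based check: true for script and style elements. */
    static boolean isScriptOrStyle(Element element) {
        String name = element.getNodeName();
        return "script".equalsIgnoreCase(name) || "style".equalsIgnoreCase(name);
    }

    /** Appends content as a CDATA section for script/style parents, otherwise as a text node. */
    static Node appendContent(Document document, Element parent, String content) {
        Node child = isScriptOrStyle(parent)
                ? document.createCDATASection(content)
                : document.createTextNode(content);
        return parent.appendChild(child);
    }

    /** Creates an empty W3C Document using the JDK's default DOM implementation. */
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }

    public static void main(String[] args) {
        Document document = newDocument();
        Element script = document.createElement("script");
        Element paragraph = document.createElement("p");
        Node scriptChild = appendContent(document, script, "if (a < b) { run(); }");
        Node textChild = appendContent(document, paragraph, "a < b");
        System.out.println("script child is CDATA: "
                + (scriptChild.getNodeType() == Node.CDATA_SECTION_NODE));
        System.out.println("p child is text: "
                + (textChild.getNodeType() == Node.TEXT_NODE));
    }
}
```

Keeping script and style bodies in CDATA sections means characters such as < and & do not need escaping when the DOM is written out as XML.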
      • outputCData

        protected String outputCData(CData cdata)
      • deserializeCdataEntities

        protected String deserializeCdataEntities(String input)
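
This page does not describe the method's behaviour. As a loose illustration of what deserializing entities in CDATA content can involve, the sketch below decodes only the five predefined XML entities. That scope is an assumption made for the example: HtmlCleaner's own implementation may recognise a different or larger set, and the class and method names here (CDataEntityDemo, decodePredefinedEntities) are hypothetical.

```java
class CDataEntityDemo {

    /**
     * Replaces the five predefined XML entities with their literal characters.
     * Assumption for illustration only; not HtmlCleaner's implementation.
     */
    static String decodePredefinedEntities(String input) {
        return input
                .replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&apos;", "'")
                // Decode the ampersand last so "&amp;lt;" yields "&lt;" and is not decoded twice.
                .replace("&amp;", "&");
    }

    public static void main(String[] args) {
        System.out.println(decodePredefinedEntities("if (a &lt; b &amp;&amp; c &gt; d) {}"));
    }
}
```

Inside a CDATA section the parser treats markup characters literally, so leaving entities such as &lt; encoded there would expose them to scripts as raw text rather than as the characters they stand for.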
      • createSubnodes

        protected void createSubnodes(Document document,
                                      Element element,
                                      List<? extends BaseToken> tagChildren)
        Serializes the given HTML Cleaner nodes as subnodes of a W3C element.
        Parameters:
        document - the W3C Document to use for creating new DOM elements
        element - the W3C element to which the subnodes are added
        tagChildren - the HTML Cleaner nodes to serialize for that node
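
The general shape of such a method can be sketched as a recursive walk that turns each child token into a DOM node and attaches it to the parent element. The example below is an assumption-laden outline, not the library's code: Token, TextToken and TagToken are hypothetical stand-ins for HTML Cleaner's BaseToken hierarchy, and comment, CDATA and attribute handling are left out.

```java
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class SubnodeSketch {

    /** Hypothetical stand-in for an HTML Cleaner token; not the library's BaseToken. */
    interface Token {}

    /** Hypothetical text token. */
    static final class TextToken implements Token {
        final String text;
        TextToken(String text) { this.text = text; }
    }

    /** Hypothetical tag token with a name and child tokens. */
    static final class TagToken implements Token {
        final String name;
        final List<? extends Token> children;
        TagToken(String name, Token... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }
    }

    /** Recursively adds a DOM node under element for each token in tagChildren. */
    static void createSubnodes(Document document, Element element,
                               List<? extends Token> tagChildren) {
        for (Token child : tagChildren) {
            if (child instanceof TextToken) {
                element.appendChild(document.createTextNode(((TextToken) child).text));
            } else if (child instanceof TagToken) {
                TagToken tag = (TagToken) child;
                Element subElement = document.createElement(tag.name);
                element.appendChild(subElement);
                createSubnodes(document, subElement, tag.children);
            }
        }
    }

    /** Builds the DOM equivalent of <div>Hello <b>world</b></div> and returns the div. */
    static Element demo() {
        try {
            Document document =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = document.createElement("div");
            document.appendChild(root);
            createSubnodes(document, root, Arrays.asList(
                    new TextToken("Hello "),
                    new TagToken("b", new TextToken("world"))));
            return root;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo().getTextContent());
    }
}
```

Passing the Document alongside the parent element matters because W3C DOM nodes must be created by the document that will own them.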