Package org.htmlcleaner
Class CleanerProperties
- java.lang.Object
-
- org.htmlcleaner.CleanerProperties
-
- All Implemented Interfaces:
HtmlModificationListener
public class CleanerProperties extends Object implements HtmlModificationListener
Properties defining the cleaner's behaviour.
-
-
Field Summary
Fields
Modifier and Type  Field  Description
static String
BOOL_ATT_EMPTY
static String
BOOL_ATT_SELF
static String
BOOL_ATT_TRUE
static String
DEFAULT_CHARSET
-
Constructor Summary
Constructors
Constructor  Description
CleanerProperties()
CleanerProperties(ITagInfoProvider tagInfoProvider)
-
Method Summary
Modifier and Type  Method  Description
void
addHtmlModificationListener(HtmlModificationListener listener)
Adds a listener to be notified of the changes the cleaner makes during the cleanup process.
void
addPruneTagNodeCondition(ITagNodeCondition condition)
Adds the condition to the existing prune tag set.
void
fireConditionModification(ITagNodeCondition condition, TagNode tagNode)
Fired when the cleaner modifies HTML because of an ITagNodeCondition match.
void
fireHtmlError(boolean certainty, TagNode startTagToken, ErrorType type)
Fired when the cleaner fixes an error in HTML syntax.
void
fireUglyHtml(boolean certainty, TagNode startTagToken, ErrorType errorType)
Fired when the cleaner fixes ugly HTML: the syntax was correct, but the task was implemented by weird code.
void
fireUserDefinedModification(boolean certainty, TagNode tagNode, ErrorType errorType)
Fired when the cleaner modifies HTML due to user-specified rules.
String
getAllowTags()
Set<ITagNodeCondition>
getAllowTagSet()
String
getBooleanAttributeValues()
String
getCharset()
CleanerTransformations
getCleanerTransformations()
int
getHtmlVersion()
Returns the HTML version.
String
getHyphenReplacementInComment()
String
getInvalidXmlAttributeNamePrefix()
Gets the prefix used when trying to make attribute names valid.
String
getPruneTags()
Set<ITagNodeCondition>
getPruneTagSet()
ITagInfoProvider
getTagInfoProvider()
String
getUseCdataFor()
boolean
isAddNewlineToHeadAndBody()
boolean
isAdvancedXmlEscape()
boolean
isAllowHtmlInsideAttributes()
boolean
isAllowInvalidAttributeNames()
If false, when outputting XML and an attribute name is not valid, attempts to fix it by using a prefix and removing invalid characters.
boolean
isAllowMultiWordAttributes()
boolean
isDeserializeEntities()
boolean
isIgnoreQuestAndExclam()
boolean
isKeepWhitespaceAndCommentsInHead()
boolean
isNamespacesAware()
boolean
isOmitCdataOutsideScriptAndStyle()
boolean
isOmitComments()
boolean
isOmitDeprecatedTags()
boolean
isOmitDoctypeDeclaration()
boolean
isOmitHtmlEnvelope()
boolean
isOmitUnknownTags()
boolean
isOmitXmlDeclaration()
boolean
isRecognizeUnicodeChars()
boolean
isTranslateSpecialEntities()
boolean
isTransResCharsToNCR()
boolean
isTransSpecialEntitiesToNCR()
boolean
isTreatDeprecatedTagsAsContent()
boolean
isTreatUnknownTagsAsContent()
boolean
isTrimAttributeValues()
boolean
isUseCdataFor(String useCdataFor)
boolean
isUseCdataForScriptAndStyle()
boolean
isUseEmptyElementTags()
void
reset()
advancedXmlEscape = true; setUseCdataFor("script,style"); translateSpecialEntities = true; recognizeUnicodeChars = true; omitUnknownTags = false; treatUnknownTagsAsContent = false; omitDeprecatedTags = false; treatDeprecatedTagsAsContent = false; omitComments = false; omitXmlDeclaration = OptionalOutput.alwaysOutput; omitDoctypeDeclaration = OptionalOutput.alwaysOutput; omitHtmlEnvelope = OptionalOutput.alwaysOutput; useEmptyElementTags = true; allowMultiWordAttributes = true; allowHtmlInsideAttributes = false; ignoreQuestAndExclam = true; namespacesAware = true; keepHeadWhitespace = true; addNewlineToHeadAndBody = true; hyphenReplacementInComment = "="; pruneTags = null; allowTags = null; booleanAttributeValues = BOOL_ATT_SELF; collapseNullHtml = CollapseHtml.none; charset = "UTF-8"; trimAttributeValues = true; tagInfoProvider = HTML5TagProvider.INSTANCE;
void
setAddNewlineToHeadAndBody(boolean addNewlineToHeadAndBody)
void
setAdvancedXmlEscape(boolean advancedXmlEscape)
void
setAllowHtmlInsideAttributes(boolean allowHtmlInsideAttributes)
void
setAllowInvalidAttributeNames(boolean allowInvalidAttributeNames)
Sets whether to allow invalid attribute names, or to try to fix or omit them.
void
setAllowMultiWordAttributes(boolean allowMultiWordAttributes)
void
setAllowTags(String allowTags)
void
setBooleanAttributeValues(String booleanAttributeValues)
void
setCharset(String charset)
void
setCleanerTransformations(CleanerTransformations cleanerTransformations)
void
setDeserializeEntities(boolean deserializeEntities)
void
setHtmlVersion(int version)
Sets the HTML version according to the parameter and sets the tag provider to the matching version.
void
setHyphenReplacementInComment(String hyphenReplacementInComment)
void
setIgnoreQuestAndExclam(boolean ignoreQuestAndExclam)
void
setInvalidXmlAttributeNamePrefix(String invalidXmlAttributePrefix)
Sets the prefix to use for XML attribute names that are invalid.
void
setKeepWhitespaceAndCommentsInHead(boolean keepHeadWhitespace)
void
setNamespacesAware(boolean namespacesAware)
void
setOmitCdataOutsideScriptAndStyle(boolean value)
void
setOmitComments(boolean omitComments)
void
setOmitDeprecatedTags(boolean omitDeprecatedTags)
void
setOmitDoctypeDeclaration(boolean omitDoctypeDeclaration)
void
setOmitHtmlEnvelope(boolean omitHtmlEnvelope)
void
setOmitUnknownTags(boolean omitUnknownTags)
void
setOmitXmlDeclaration(boolean omitXmlDeclaration)
void
setPruneTags(String pruneTags)
Resets the prune tag set and adds tag name conditions to it.
void
setRecognizeUnicodeChars(boolean recognizeUnicodeChars)
void
setTranslateSpecialEntities(boolean translateSpecialEntities)
TODO : useOptionalOutput
void
setTransResCharsToNCR(boolean transResCharsToNCR)
void
setTransSpecialEntitiesToNCR(boolean transSpecialEntitiesToNCR)
void
setTreatDeprecatedTagsAsContent(boolean treatDeprecatedTagsAsContent)
void
setTreatUnknownTagsAsContent(boolean treatUnknownTagsAsContent)
void
setTrimAttributeValues(boolean trimAttributeValues)
void
setUseCdataFor(String useCdataFor)
void
setUseCdataForScriptAndStyle(boolean useCdataForScriptAndStyle)
void
setUseEmptyElementTags(boolean useEmptyElementTags)
-
-
-
Field Detail
-
DEFAULT_CHARSET
public static final String DEFAULT_CHARSET
- See Also:
- Constant Field Values
-
BOOL_ATT_SELF
public static final String BOOL_ATT_SELF
- See Also:
- Constant Field Values
-
BOOL_ATT_EMPTY
public static final String BOOL_ATT_EMPTY
- See Also:
- Constant Field Values
-
BOOL_ATT_TRUE
public static final String BOOL_ATT_TRUE
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
CleanerProperties
public CleanerProperties()
-
CleanerProperties
public CleanerProperties(ITagInfoProvider tagInfoProvider)
- Parameters:
tagInfoProvider
-
-
-
Method Detail
-
getTagInfoProvider
public ITagInfoProvider getTagInfoProvider()
-
isAdvancedXmlEscape
public boolean isAdvancedXmlEscape()
-
setAdvancedXmlEscape
public void setAdvancedXmlEscape(boolean advancedXmlEscape)
-
isTransResCharsToNCR
public boolean isTransResCharsToNCR()
-
setTransResCharsToNCR
public void setTransResCharsToNCR(boolean transResCharsToNCR)
-
isUseCdataForScriptAndStyle
public boolean isUseCdataForScriptAndStyle()
-
setUseCdataForScriptAndStyle
public void setUseCdataForScriptAndStyle(boolean useCdataForScriptAndStyle)
-
setUseCdataFor
public void setUseCdataFor(String useCdataFor)
-
getUseCdataFor
public String getUseCdataFor()
-
isUseCdataFor
public boolean isUseCdataFor(String useCdataFor)
-
isTranslateSpecialEntities
public boolean isTranslateSpecialEntities()
-
setTranslateSpecialEntities
public void setTranslateSpecialEntities(boolean translateSpecialEntities)
TODO : useOptionalOutput
- Parameters:
translateSpecialEntities
-
-
isRecognizeUnicodeChars
public boolean isRecognizeUnicodeChars()
-
setRecognizeUnicodeChars
public void setRecognizeUnicodeChars(boolean recognizeUnicodeChars)
-
isOmitUnknownTags
public boolean isOmitUnknownTags()
-
setOmitUnknownTags
public void setOmitUnknownTags(boolean omitUnknownTags)
-
isTreatUnknownTagsAsContent
public boolean isTreatUnknownTagsAsContent()
-
setTreatUnknownTagsAsContent
public void setTreatUnknownTagsAsContent(boolean treatUnknownTagsAsContent)
-
isOmitDeprecatedTags
public boolean isOmitDeprecatedTags()
-
setOmitDeprecatedTags
public void setOmitDeprecatedTags(boolean omitDeprecatedTags)
-
isTreatDeprecatedTagsAsContent
public boolean isTreatDeprecatedTagsAsContent()
-
setTreatDeprecatedTagsAsContent
public void setTreatDeprecatedTagsAsContent(boolean treatDeprecatedTagsAsContent)
-
isOmitComments
public boolean isOmitComments()
-
setOmitComments
public void setOmitComments(boolean omitComments)
-
isOmitXmlDeclaration
public boolean isOmitXmlDeclaration()
-
setOmitXmlDeclaration
public void setOmitXmlDeclaration(boolean omitXmlDeclaration)
-
isOmitDoctypeDeclaration
public boolean isOmitDoctypeDeclaration()
- Returns:
- true if the doctype declaration is omitted; also returns true if the HTML envelope is omitted
-
setOmitDoctypeDeclaration
public void setOmitDoctypeDeclaration(boolean omitDoctypeDeclaration)
-
isOmitHtmlEnvelope
public boolean isOmitHtmlEnvelope()
-
setOmitHtmlEnvelope
public void setOmitHtmlEnvelope(boolean omitHtmlEnvelope)
-
isUseEmptyElementTags
public boolean isUseEmptyElementTags()
-
setUseEmptyElementTags
public void setUseEmptyElementTags(boolean useEmptyElementTags)
-
isAllowMultiWordAttributes
public boolean isAllowMultiWordAttributes()
-
setAllowMultiWordAttributes
public void setAllowMultiWordAttributes(boolean allowMultiWordAttributes)
-
isAllowHtmlInsideAttributes
public boolean isAllowHtmlInsideAttributes()
-
setAllowHtmlInsideAttributes
public void setAllowHtmlInsideAttributes(boolean allowHtmlInsideAttributes)
-
isIgnoreQuestAndExclam
public boolean isIgnoreQuestAndExclam()
-
setIgnoreQuestAndExclam
public void setIgnoreQuestAndExclam(boolean ignoreQuestAndExclam)
-
isNamespacesAware
public boolean isNamespacesAware()
-
setNamespacesAware
public void setNamespacesAware(boolean namespacesAware)
-
isAddNewlineToHeadAndBody
public boolean isAddNewlineToHeadAndBody()
-
setAddNewlineToHeadAndBody
public void setAddNewlineToHeadAndBody(boolean addNewlineToHeadAndBody)
-
isKeepWhitespaceAndCommentsInHead
public boolean isKeepWhitespaceAndCommentsInHead()
-
setKeepWhitespaceAndCommentsInHead
public void setKeepWhitespaceAndCommentsInHead(boolean keepHeadWhitespace)
-
getHyphenReplacementInComment
public String getHyphenReplacementInComment()
-
setHyphenReplacementInComment
public void setHyphenReplacementInComment(String hyphenReplacementInComment)
-
getPruneTags
public String getPruneTags()
-
isOmitCdataOutsideScriptAndStyle
public boolean isOmitCdataOutsideScriptAndStyle()
-
setOmitCdataOutsideScriptAndStyle
public void setOmitCdataOutsideScriptAndStyle(boolean value)
-
isDeserializeEntities
public boolean isDeserializeEntities()
-
setDeserializeEntities
public void setDeserializeEntities(boolean deserializeEntities)
-
setHtmlVersion
public void setHtmlVersion(int version)
Sets the HTML version according to the parameter and sets the tag provider to the matching version.
- Parameters:
version
- 4 for HTML 4 or 5 for HTML5
-
getHtmlVersion
public int getHtmlVersion()
Returns the HTML version.
- Returns:
- the HTML version, as an int
-
isTrimAttributeValues
public boolean isTrimAttributeValues()
-
setTrimAttributeValues
public void setTrimAttributeValues(boolean trimAttributeValues)
-
setPruneTags
public void setPruneTags(String pruneTags)
Resets the prune tag set and adds tag name conditions to it. All tags listed in the pruneTags parameter are added.
- Parameters:
pruneTags
-
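As a rough illustration of the reset-then-add behaviour described above, the following sketch models the prune set as plain tag names. It assumes pruneTags is a comma-separated list (the same format reset() passes to setUseCdataFor). The class PruneTagsSketch and its addPruneTag method are hypothetical stand-ins, not HtmlCleaner code; the real set holds ITagNodeCondition objects rather than strings.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for the prune-tag bookkeeping; not the library's implementation.
final class PruneTagsSketch {
    private final Set<String> pruneTagSet = new LinkedHashSet<>();

    // Assumption: tag names arrive as a comma-separated list such as "script,style".
    void setPruneTags(String pruneTags) {
        pruneTagSet.clear(); // the existing set is reset first
        if (pruneTags == null) {
            return;
        }
        for (String name : pruneTags.split(",")) {
            String trimmed = name.trim().toLowerCase(Locale.ROOT);
            if (!trimmed.isEmpty()) {
                pruneTagSet.add(trimmed); // one entry per listed tag
            }
        }
    }

    // Loosely mirrors addPruneTagNodeCondition: extends the set without clearing it.
    void addPruneTag(String name) {
        pruneTagSet.add(name.trim().toLowerCase(Locale.ROOT));
    }

    Set<String> getPruneTagSet() {
        return pruneTagSet;
    }

    public static void main(String[] args) {
        PruneTagsSketch props = new PruneTagsSketch();
        props.setPruneTags("script, style");
        props.addPruneTag("iframe");
        System.out.println(props.getPruneTagSet()); // prints [script, style, iframe]
        props.setPruneTags("form");
        System.out.println(props.getPruneTagSet()); // prints [form]
    }
}
```

The second setPruneTags call discards the earlier entries, including the one added separately, which is the difference between setting and adding that the two method descriptions draw.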
-
addPruneTagNodeCondition
public void addPruneTagNodeCondition(ITagNodeCondition condition)
Adds the condition to the existing prune tag set.
- Parameters:
condition
-
-
getPruneTagSet
public Set<ITagNodeCondition> getPruneTagSet()
-
getAllowTags
public String getAllowTags()
-
setAllowTags
public void setAllowTags(String allowTags)
-
isTransSpecialEntitiesToNCR
public boolean isTransSpecialEntitiesToNCR()
-
setTransSpecialEntitiesToNCR
public void setTransSpecialEntitiesToNCR(boolean transSpecialEntitiesToNCR)
-
getAllowTagSet
public Set<ITagNodeCondition> getAllowTagSet()
-
setCharset
public void setCharset(String charset)
- Parameters:
charset
- the charset to set
-
getCharset
public String getCharset()
- Returns:
- the charset
-
getBooleanAttributeValues
public String getBooleanAttributeValues()
-
setBooleanAttributeValues
public void setBooleanAttributeValues(String booleanAttributeValues)
-
reset
public void reset()
Resets the properties to the following values:
advancedXmlEscape = true;
setUseCdataFor("script,style");
translateSpecialEntities = true;
recognizeUnicodeChars = true;
omitUnknownTags = false;
treatUnknownTagsAsContent = false;
omitDeprecatedTags = false;
treatDeprecatedTagsAsContent = false;
omitComments = false;
omitXmlDeclaration = OptionalOutput.alwaysOutput;
omitDoctypeDeclaration = OptionalOutput.alwaysOutput;
omitHtmlEnvelope = OptionalOutput.alwaysOutput;
useEmptyElementTags = true;
allowMultiWordAttributes = true;
allowHtmlInsideAttributes = false;
ignoreQuestAndExclam = true;
namespacesAware = true;
keepHeadWhitespace = true;
addNewlineToHeadAndBody = true;
hyphenReplacementInComment = "=";
pruneTags = null;
allowTags = null;
booleanAttributeValues = BOOL_ATT_SELF;
collapseNullHtml = CollapseHtml.none;
charset = "UTF-8";
trimAttributeValues = true;
tagInfoProvider = HTML5TagProvider.INSTANCE;
-
getCleanerTransformations
public CleanerTransformations getCleanerTransformations()
- Returns:
- the cleanerTransformations
-
setCleanerTransformations
public void setCleanerTransformations(CleanerTransformations cleanerTransformations)
-
addHtmlModificationListener
public void addHtmlModificationListener(HtmlModificationListener listener)
Adds a listener to be notified of the changes the cleaner makes during the cleanup process.
- Parameters:
listener
- listener object to be notified of the changes.
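Because this class both accepts listeners and implements HtmlModificationListener itself, one plausible reading is that each fire method relays the event to every registered listener. The sketch below shows that relay pattern under that assumption. The interface ModificationListenerSketch (reduced to a single callback taking a tag name) and the class ListenerRelaySketch are hypothetical stand-ins, not the HtmlCleaner types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for HtmlModificationListener (one callback instead of four).
interface ModificationListenerSketch {
    void fireHtmlError(boolean certainty, String tagName);
}

// Hypothetical stand-in for the properties object: it is itself a listener and relays events.
final class ListenerRelaySketch implements ModificationListenerSketch {
    private final List<ModificationListenerSketch> listeners = new ArrayList<>();

    void addHtmlModificationListener(ModificationListenerSketch listener) {
        listeners.add(listener);
    }

    @Override
    public void fireHtmlError(boolean certainty, String tagName) {
        // Assumption: every registered listener sees every event, in registration order.
        for (ModificationListenerSketch listener : listeners) {
            listener.fireHtmlError(certainty, tagName);
        }
    }

    public static void main(String[] args) {
        ListenerRelaySketch props = new ListenerRelaySketch();
        List<String> log = new ArrayList<>();
        props.addHtmlModificationListener((certainty, tag) -> log.add(tag + ":" + certainty));
        props.fireHtmlError(true, "font");
        System.out.println(log); // prints [font:true]
    }
}
```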
-
fireConditionModification
public void fireConditionModification(ITagNodeCondition condition, TagNode tagNode)
Description copied from interface: HtmlModificationListener
Fired when the cleaner modifies HTML because of an ITagNodeCondition match.
- Specified by:
fireConditionModification in interface HtmlModificationListener
- Parameters:
condition
- the condition that was applied to make the modification.
tagNode
- the problematic node.
-
fireHtmlError
public void fireHtmlError(boolean certainty, TagNode startTagToken, ErrorType type)
Description copied from interface: HtmlModificationListener
Fired when the cleaner fixes an error in HTML syntax.
- Specified by:
fireHtmlError in interface HtmlModificationListener
- Parameters:
certainty
- true if the change does not hurt the end document.
startTagToken
- the problematic node.
-
fireUglyHtml
public void fireUglyHtml(boolean certainty, TagNode startTagToken, ErrorType errorType)
Description copied from interface: HtmlModificationListener
Fired when the cleaner fixes ugly HTML: the syntax was correct, but the task was implemented by weird code. For example, when deprecated tags are removed.
- Specified by:
fireUglyHtml in interface HtmlModificationListener
- Parameters:
certainty
- true if the change does not hurt the end document.
startTagToken
- the problematic node.
-
fireUserDefinedModification
public void fireUserDefinedModification(boolean certainty, TagNode tagNode, ErrorType errorType)
Description copied from interface: HtmlModificationListener
Fired when the cleaner modifies HTML due to user-specified rules.
- Specified by:
fireUserDefinedModification in interface HtmlModificationListener
- Parameters:
certainty
- true if the change does not hurt the end document.
tagNode
- the problematic node.
-
getInvalidXmlAttributeNamePrefix
public String getInvalidXmlAttributeNamePrefix()
Gets the prefix used when trying to make attribute names valid.
- Returns:
-
setInvalidXmlAttributeNamePrefix
public void setInvalidXmlAttributeNamePrefix(String invalidXmlAttributePrefix)
Sets the prefix to use for XML attribute names that are invalid.
- Parameters:
invalidXmlAttributePrefix
-
-
setAllowInvalidAttributeNames
public void setAllowInvalidAttributeNames(boolean allowInvalidAttributeNames)
Sets whether to allow invalid attribute names, or to try to fix or omit them.
- Parameters:
allowInvalidAttributeNames
-
-
isAllowInvalidAttributeNames
public boolean isAllowInvalidAttributeNames()
If false, when outputting XML and an attribute name is not valid, attempts to fix it by using a prefix and removing invalid characters. Otherwise, omits invalid attributes.
- Returns:
-
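To make "using a prefix and removing invalid characters" concrete, here is a minimal sketch of one way such a fix could work. It is not the library's algorithm: the class AttributeNameSketch, the method fixName, the ASCII-only validity rules (colons are deliberately excluded) and the prefix "x-" are all assumptions chosen for illustration.

```java
// Illustrative only: a simplified, ASCII-only notion of a valid XML attribute name.
final class AttributeNameSketch {

    // Characters allowed at the start of a name in this simplified model.
    static boolean isNameStart(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_';
    }

    // Characters allowed anywhere after the first position in this simplified model.
    static boolean isNameChar(char c) {
        return isNameStart(c) || (c >= '0' && c <= '9') || c == '-' || c == '.';
    }

    // Drops invalid characters, then prepends the prefix if the result still
    // cannot start a name. Returns an empty string when nothing usable remains,
    // in which case a caller would omit the attribute.
    static String fixName(String name, String prefix) {
        StringBuilder cleaned = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (isNameChar(c)) {
                cleaned.append(c);
            }
        }
        if (cleaned.length() == 0) {
            return "";
        }
        if (!isNameStart(cleaned.charAt(0))) {
            cleaned.insert(0, prefix);
        }
        return cleaned.toString();
    }

    public static void main(String[] args) {
        System.out.println(fixName("data@id", "x-")); // prints dataid
        System.out.println(fixName("1st", "x-"));     // prints x-1st
        System.out.println(fixName("class", "x-"));   // prints class
    }
}
```

The prefix only helps when it begins with a valid start character itself, so a real configuration would presumably need a prefix that satisfies the same rule.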
-