Class WebcrawlerConnector.CanonicalizationPolicy

  • Enclosing class:
    WebcrawlerConnector

    protected static class WebcrawlerConnector.CanonicalizationPolicy
    extends java.lang.Object
    Class representing a URL regular expression match, used to determine which canonicalization policy applies to a URL.
    • Field Detail

      • matchPattern

        protected final java.util.regex.Pattern matchPattern
      • reorder

        protected final boolean reorder
      • removeJavaSession

        protected final boolean removeJavaSession
      • removeAspSession

        protected final boolean removeAspSession
      • removePhpSession

        protected final boolean removePhpSession
      • removeBVSession

        protected final boolean removeBVSession
      • lowercasing

        protected final boolean lowercasing
    • Constructor Detail

      • CanonicalizationPolicy

        public CanonicalizationPolicy(java.util.regex.Pattern matchPattern,
                                      boolean reorder,
                                      boolean removeJavaSession,
                                      boolean removeAspSession,
                                      boolean removePhpSession,
                                      boolean removeBVSession,
                                      boolean lowercasing)
    • Method Detail

      • checkMatch

        public boolean checkMatch(java.lang.String url)
      • canReorder

        public boolean canReorder()
      • canRemoveJavaSession

        public boolean canRemoveJavaSession()
      • canRemoveAspSession

        public boolean canRemoveAspSession()
      • canRemovePhpSession

        public boolean canRemovePhpSession()
      • canRemoveBvSession

        public boolean canRemoveBvSession()
      • canLowercase

        public boolean canLowercase()
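
    • Usage sketch

    The listing above gives only signatures, so the following is a minimal, self-contained sketch of how a class with this shape might behave. It is not the connector's actual source. The stand-in class mirrors the documented fields, constructor, and accessors. The body of checkMatch is an assumption (a partial match via Matcher.find()), and the wrapper class CanonicalizationPolicySketch, the helper firstMatch, and the example URLs and patterns are all hypothetical names introduced for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class CanonicalizationPolicySketch {

    // Stand-in that mirrors the documented members of
    // WebcrawlerConnector.CanonicalizationPolicy. Not the real class.
    public static class CanonicalizationPolicy {
        protected final Pattern matchPattern;
        protected final boolean reorder;
        protected final boolean removeJavaSession;
        protected final boolean removeAspSession;
        protected final boolean removePhpSession;
        protected final boolean removeBVSession;
        protected final boolean lowercasing;

        public CanonicalizationPolicy(Pattern matchPattern,
                                      boolean reorder,
                                      boolean removeJavaSession,
                                      boolean removeAspSession,
                                      boolean removePhpSession,
                                      boolean removeBVSession,
                                      boolean lowercasing) {
            this.matchPattern = matchPattern;
            this.reorder = reorder;
            this.removeJavaSession = removeJavaSession;
            this.removeAspSession = removeAspSession;
            this.removePhpSession = removePhpSession;
            this.removeBVSession = removeBVSession;
            this.lowercasing = lowercasing;
        }

        // Assumption: the pattern may match anywhere in the URL (find()),
        // rather than having to match the whole string (matches()).
        public boolean checkMatch(String url) {
            return matchPattern.matcher(url).find();
        }

        public boolean canReorder() { return reorder; }
        public boolean canRemoveJavaSession() { return removeJavaSession; }
        public boolean canRemoveAspSession() { return removeAspSession; }
        public boolean canRemovePhpSession() { return removePhpSession; }
        public boolean canRemoveBvSession() { return removeBVSession; }
        public boolean canLowercase() { return lowercasing; }
    }

    // Hypothetical helper: returns the first policy whose pattern matches
    // the URL, or null when no policy applies.
    public static CanonicalizationPolicy firstMatch(List<CanonicalizationPolicy> policies,
                                                    String url) {
        for (CanonicalizationPolicy policy : policies) {
            if (policy.checkMatch(url)) {
                return policy;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Two illustrative policies; the patterns and flag choices are made up.
        List<CanonicalizationPolicy> policies = Arrays.asList(
            new CanonicalizationPolicy(Pattern.compile("^https?://shop\\.example\\.com/"),
                                       true, true, false, false, false, false),
            new CanonicalizationPolicy(Pattern.compile("\\.php(\\?|$)"),
                                       false, false, false, true, false, true));

        String url = "http://forum.example.org/index.php?PHPSESSID=abc";
        CanonicalizationPolicy policy = firstMatch(policies, url);
        if (policy != null) {
            System.out.println("reorder=" + policy.canReorder()
                + " removePhpSession=" + policy.canRemovePhpSession()
                + " lowercase=" + policy.canLowercase());
        } else {
            System.out.println("no policy matched");
        }
    }
}
```

    In this sketch the policy object only answers questions: whether it applies to a URL, and which rewrites are permitted. The code that actually reorders arguments, strips session identifiers, or lowercases the URL is assumed to live elsewhere in the connector and is not shown.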