Class KafkaOutputConnector

  • All Implemented Interfaces:
    org.apache.manifoldcf.agents.interfaces.IOutputConnector, org.apache.manifoldcf.agents.interfaces.IPipelineConnector, org.apache.manifoldcf.core.interfaces.IConnector

    public class KafkaOutputConnector
    extends org.apache.manifoldcf.agents.output.BaseOutputConnector
    This is a Kafka output connector.
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static java.lang.String _rcsid  
      protected static java.lang.String allowAttributeName
      The allow attribute name
      protected static java.lang.String denyAttributeName
      The deny attribute name
      static java.lang.String INGEST_ACTIVITY
      Ingestion activity
      static java.lang.String JOB_COMPLETE_ACTIVITY
      Job notify activity
      protected static java.lang.String noSecurityToken
      The no-security token
      protected static boolean useNullValue  
      • Fields inherited from class org.apache.manifoldcf.core.connector.BaseConnector

        currentContext, params
      • Fields inherited from interface org.apache.manifoldcf.agents.interfaces.IPipelineConnector

        DOCUMENTSTATUS_ACCEPTED, DOCUMENTSTATUS_REJECTED
    • Method Summary

      Modifier and Type Method Description
      int addOrReplaceDocumentWithException​(java.lang.String documentURI, org.apache.manifoldcf.core.interfaces.VersionContext pipelineDescription, org.apache.manifoldcf.agents.interfaces.RepositoryDocument document, java.lang.String authorityNameString, org.apache.manifoldcf.agents.interfaces.IOutputAddActivity activities)
      Add (or replace) a document in the output data store using the connector.
      java.lang.String check()
      Test the connection.
      void connect​(org.apache.manifoldcf.core.interfaces.ConfigParams configParameters)
      Connect.
      void disconnect()
      Close the connection.
      java.lang.String[] getActivitiesList()
      Return the list of activities that this connector supports (i.e. writes into the log).
      org.apache.manifoldcf.core.interfaces.VersionContext getPipelineDescription​(org.apache.manifoldcf.core.interfaces.Specification spec)
      Get an output version string, given an output specification.
      void noteJobComplete​(org.apache.manifoldcf.agents.interfaces.IOutputNotifyActivity activities)
      Notify the connector of a completed job.
      void outputConfigurationBody​(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters, java.lang.String tabName)  
      void outputConfigurationHeader​(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters, java.util.List<java.lang.String> tabsArray)  
      java.lang.String processConfigurationPost​(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IPostParameters variableContext, org.apache.manifoldcf.core.interfaces.ConfigParams parameters)  
      void setProducer​(org.apache.kafka.clients.producer.KafkaProducer producer)  
      void viewConfiguration​(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters)  
      • Methods inherited from class org.apache.manifoldcf.agents.output.BaseOutputConnector

        checkDateIndexable, checkDocumentIndexable, checkLengthIndexable, checkMimeTypeIndexable, checkURLIndexable, getFormCheckJavascriptMethodName, getFormPresaveCheckJavascriptMethodName, noteAllRecordsRemoved, outputSpecificationBody, outputSpecificationHeader, processSpecificationPost, removeDocument, requestInfo, viewSpecification
      • Methods inherited from class org.apache.manifoldcf.core.connector.BaseConnector

        clearThreadContext, deinstall, getConfiguration, install, isConnected, outputConfigurationBody, outputConfigurationHeader, outputConfigurationHeader, pack, packFixedList, packList, packList, poll, processConfigurationPost, setThreadContext, unpack, unpackFixedList, unpackList, viewConfiguration
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
      • Methods inherited from interface org.apache.manifoldcf.core.interfaces.IConnector

        clearThreadContext, deinstall, getConfiguration, install, isConnected, poll, processConfigurationPost, setThreadContext
    • Field Detail

      • INGEST_ACTIVITY

        public static final java.lang.String INGEST_ACTIVITY
        Ingestion activity
        See Also:
        Constant Field Values
      • JOB_COMPLETE_ACTIVITY

        public static final java.lang.String JOB_COMPLETE_ACTIVITY
        Job notify activity
        See Also:
        Constant Field Values
      • allowAttributeName

        protected static final java.lang.String allowAttributeName
        The allow attribute name
        See Also:
        Constant Field Values
      • denyAttributeName

        protected static final java.lang.String denyAttributeName
        The deny attribute name
        See Also:
        Constant Field Values
      • noSecurityToken

        protected static final java.lang.String noSecurityToken
        The no-security token
        See Also:
        Constant Field Values
    • Constructor Detail

      • KafkaOutputConnector

        public KafkaOutputConnector()
        Constructor.
    • Method Detail

      • setProducer

        public void setProducer​(org.apache.kafka.clients.producer.KafkaProducer producer)
      • getActivitiesList

        public java.lang.String[] getActivitiesList()
        Return the list of activities that this connector supports (i.e. writes into the log).
        Specified by:
        getActivitiesList in interface org.apache.manifoldcf.agents.interfaces.IOutputConnector
        Overrides:
        getActivitiesList in class org.apache.manifoldcf.agents.output.BaseOutputConnector
        Returns:
        the list.
      • connect

        public void connect​(org.apache.manifoldcf.core.interfaces.ConfigParams configParameters)
        Connect.
        Specified by:
        connect in interface org.apache.manifoldcf.core.interfaces.IConnector
        Overrides:
        connect in class org.apache.manifoldcf.core.connector.BaseConnector
        Parameters:
        configParameters - is the set of configuration parameters, which in this case describe the target appliance, basic auth configuration, etc. (This formerly came out of the ini file.)
      • disconnect

        public void disconnect()
                        throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
        Close the connection. Call this before discarding the connection.
        Specified by:
        disconnect in interface org.apache.manifoldcf.core.interfaces.IConnector
        Overrides:
        disconnect in class org.apache.manifoldcf.core.connector.BaseConnector
        Throws:
        org.apache.manifoldcf.core.interfaces.ManifoldCFException
      • outputConfigurationHeader

        public void outputConfigurationHeader​(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext,
                                              org.apache.manifoldcf.core.interfaces.IHTTPOutput out,
                                              java.util.Locale locale,
                                              org.apache.manifoldcf.core.interfaces.ConfigParams parameters,
                                              java.util.List<java.lang.String> tabsArray)
                                       throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
                                              java.io.IOException
        Specified by:
        outputConfigurationHeader in interface org.apache.manifoldcf.core.interfaces.IConnector
        Overrides:
        outputConfigurationHeader in class org.apache.manifoldcf.core.connector.BaseConnector
        Throws:
        org.apache.manifoldcf.core.interfaces.ManifoldCFException
        java.io.IOException
      • outputConfigurationBody

        public void outputConfigurationBody​(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext,
                                            org.apache.manifoldcf.core.interfaces.IHTTPOutput out,
                                            java.util.Locale locale,
                                            org.apache.manifoldcf.core.interfaces.ConfigParams parameters,
                                            java.lang.String tabName)
                                     throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
                                            java.io.IOException
        Specified by:
        outputConfigurationBody in interface org.apache.manifoldcf.core.interfaces.IConnector
        Overrides:
        outputConfigurationBody in class org.apache.manifoldcf.core.connector.BaseConnector
        Throws:
        org.apache.manifoldcf.core.interfaces.ManifoldCFException
        java.io.IOException
      • viewConfiguration

        public void viewConfiguration​(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext,
                                      org.apache.manifoldcf.core.interfaces.IHTTPOutput out,
                                      java.util.Locale locale,
                                      org.apache.manifoldcf.core.interfaces.ConfigParams parameters)
                               throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
                                      java.io.IOException
        Specified by:
        viewConfiguration in interface org.apache.manifoldcf.core.interfaces.IConnector
        Overrides:
        viewConfiguration in class org.apache.manifoldcf.core.connector.BaseConnector
        Throws:
        org.apache.manifoldcf.core.interfaces.ManifoldCFException
        java.io.IOException
      • processConfigurationPost

        public java.lang.String processConfigurationPost​(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext,
                                                         org.apache.manifoldcf.core.interfaces.IPostParameters variableContext,
                                                         org.apache.manifoldcf.core.interfaces.ConfigParams parameters)
                                                  throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
        Overrides:
        processConfigurationPost in class org.apache.manifoldcf.core.connector.BaseConnector
        Throws:
        org.apache.manifoldcf.core.interfaces.ManifoldCFException
      • check

        public java.lang.String check()
                               throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
        Test the connection. Returns a string describing the connection integrity.
        Specified by:
        check in interface org.apache.manifoldcf.core.interfaces.IConnector
        Overrides:
        check in class org.apache.manifoldcf.core.connector.BaseConnector
        Returns:
        the connection's status as a displayable string.
        Throws:
        org.apache.manifoldcf.core.interfaces.ManifoldCFException
      • getPipelineDescription

        public org.apache.manifoldcf.core.interfaces.VersionContext getPipelineDescription​(org.apache.manifoldcf.core.interfaces.Specification spec)
                                                                                    throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
                                                                                           org.apache.manifoldcf.agents.interfaces.ServiceInterruption
        Get an output version string, given an output specification. The output version string is used to uniquely describe the pertinent details of the output specification and the configuration, to allow the Connector Framework to determine whether a document will need to be output again. Note that the contents of the document cannot be considered by this method, and that a different version string (defined in IRepositoryConnector) is used to describe the version of the actual document. This method presumes that the connector object has been configured, and it is thus able to communicate with the output data store should that be necessary.
        Specified by:
        getPipelineDescription in interface org.apache.manifoldcf.agents.interfaces.IPipelineConnector
        Overrides:
        getPipelineDescription in class org.apache.manifoldcf.agents.output.BaseOutputConnector
        Parameters:
        spec - is the current output specification for the job that is doing the crawling.
        Returns:
        a string, of unlimited length, which uniquely describes the output configuration and specification in such a way that if two such strings are equal, the document will not need to be sent again to the output data store.
        Throws:
        org.apache.manifoldcf.core.interfaces.ManifoldCFException
        org.apache.manifoldcf.agents.interfaces.ServiceInterruption
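
        The version-string contract described above can be illustrated with a minimal, self-contained sketch. It does not use the ManifoldCF classes and is not this connector's actual implementation: the class VersionStringSketch, its methods, and the setting names used with it are hypothetical stand-ins. It only shows the idea that equal strings mean the document need not be sent again.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in; not part of ManifoldCF or of this connector. */
final class VersionStringSketch {
  private VersionStringSketch() {}

  /**
   * Builds a deterministic string from the settings that affect output.
   * Keys are sorted so that insertion order does not change the result.
   * Assumes keys and values are non-null.
   */
  static String buildVersionString(Map<String, String> outputSettings) {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, String> e : new TreeMap<>(outputSettings).entrySet()) {
      // Length-prefix each piece so that ("a","bc") and ("ab","c") differ.
      append(sb, e.getKey());
      append(sb, e.getValue());
    }
    return sb.toString();
  }

  private static void append(StringBuilder sb, String s) {
    sb.append(s.length()).append(':').append(s);
  }

  /** True when the stored version differs, i.e. the document must be output again. */
  static boolean needsReindex(String previousVersion, String currentVersion) {
    return !currentVersion.equals(previousVersion);
  }
}
```

        Because the string depends only on configuration and specification, changing a setting (for example a hypothetical "topic" entry) changes the string and forces re-output, while the document's own content is tracked by a separate version string.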
      • addOrReplaceDocumentWithException

        public int addOrReplaceDocumentWithException​(java.lang.String documentURI,
                                                     org.apache.manifoldcf.core.interfaces.VersionContext pipelineDescription,
                                                     org.apache.manifoldcf.agents.interfaces.RepositoryDocument document,
                                                     java.lang.String authorityNameString,
                                                     org.apache.manifoldcf.agents.interfaces.IOutputAddActivity activities)
                                              throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
                                                     org.apache.manifoldcf.agents.interfaces.ServiceInterruption,
                                                     java.io.IOException
        Add (or replace) a document in the output data store using the connector. This method presumes that the connector object has been configured, and it is thus able to communicate with the output data store should that be necessary.
        Specified by:
        addOrReplaceDocumentWithException in interface org.apache.manifoldcf.agents.interfaces.IPipelineConnector
        Overrides:
        addOrReplaceDocumentWithException in class org.apache.manifoldcf.agents.output.BaseOutputConnector
        Parameters:
        documentURI - is the URI of the document. The URI is presumed to be the unique identifier which the output data store will use to process and serve the document. This URI is constructed by the repository connector which fetches the document, and is thus universal across all output connectors.
        pipelineDescription - includes the description string that was constructed for this document by the getPipelineDescription() method.
        document - is the document data to be processed (handed to the output data store).
        authorityNameString - is the name of the authority responsible for authorizing any access tokens passed in with the repository document. May be null.
        activities - is the handle to an object that the implementer of a pipeline connector may use to perform operations, such as logging processing activity, or sending a modified document to the next stage in the pipeline.
        Returns:
        the document status (accepted or permanently rejected).
        Throws:
        java.io.IOException - only if there's a stream error reading the document data.
        org.apache.manifoldcf.core.interfaces.ManifoldCFException
        org.apache.manifoldcf.agents.interfaces.ServiceInterruption
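
        The flow described above (read the document stream, hand it to the data store keyed by its URI, report a status) can be sketched without the ManifoldCF or Kafka libraries. Everything in this example is an assumption made for illustration: AddDocumentSketch, RecordSink, the maxBytes rejection rule, and the numeric status values are hypothetical and are not taken from the connector's source.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical stand-in; not part of ManifoldCF or of this connector. */
final class AddDocumentSketch {
  // Local stand-ins for DOCUMENTSTATUS_ACCEPTED and DOCUMENTSTATUS_REJECTED.
  // The numeric values are placeholders chosen for this sketch.
  static final int ACCEPTED = 0;
  static final int REJECTED = 1;

  /** Hypothetical abstraction over whatever delivers a record to the data store. */
  interface RecordSink {
    void send(String key, byte[] value);
  }

  private AddDocumentSketch() {}

  /**
   * Reads the document content and sends it keyed by documentURI.
   * An IOException escapes only if reading the stream fails.
   * The size limit is an assumed rejection rule, used here for illustration.
   */
  static int addOrReplace(String documentURI, InputStream content,
                          long maxBytes, RecordSink sink) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    byte[] chunk = new byte[8192];
    int n;
    while ((n = content.read(chunk)) != -1) {
      buffer.write(chunk, 0, n);
      if (buffer.size() > maxBytes) {
        // Permanently reject without sending anything.
        return REJECTED;
      }
    }
    // The URI is the unique identifier, so it serves as the record key.
    sink.send(documentURI, buffer.toByteArray());
    return ACCEPTED;
  }
}
```

        Using the URI as the key means that sending the same document again replaces, rather than duplicates, the earlier record in any store that deduplicates by key.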
      • noteJobComplete

        public void noteJobComplete​(org.apache.manifoldcf.agents.interfaces.IOutputNotifyActivity activities)
                             throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
                                    org.apache.manifoldcf.agents.interfaces.ServiceInterruption
        Notify the connector of a completed job. This is meant to allow the connector to flush any internal data structures it has been keeping around, or to tell the output repository that this is a good time to synchronize things. It is called whenever a job is either completed or aborted.
        Specified by:
        noteJobComplete in interface org.apache.manifoldcf.agents.interfaces.IOutputConnector
        Overrides:
        noteJobComplete in class org.apache.manifoldcf.agents.output.BaseOutputConnector
        Parameters:
        activities - is the handle to an object that the implementer of an output connector may use to perform operations, such as logging processing activity.
        Throws:
        org.apache.manifoldcf.core.interfaces.ManifoldCFException
        org.apache.manifoldcf.agents.interfaces.ServiceInterruption
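
        As a rough illustration of the flush-on-completion behaviour described above, the following self-contained sketch flushes a client and records an activity when a job ends. JobCompleteSketch, its Client interface, and the activity text are hypothetical stand-ins; whether this connector actually flushes its producer at this point is not stated on this page.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in; not part of ManifoldCF or of this connector. */
final class JobCompleteSketch {
  /** Hypothetical abstraction over the client that talks to the data store. */
  interface Client {
    void send(String record);
    void flush();
  }

  private final Client client;
  private final List<String> activityLog = new ArrayList<>();

  JobCompleteSketch(Client client) {
    this.client = client;
  }

  /** Hands a record to the client; the client may buffer it internally. */
  void add(String record) {
    client.send(record);
  }

  /** Called when a job completes or is aborted: push out anything still buffered. */
  void noteJobComplete() {
    client.flush();
    // Placeholder activity text; the real JOB_COMPLETE_ACTIVITY value is not shown here.
    activityLog.add("job complete");
  }

  /** Read-only view of the activities recorded so far. */
  List<String> activityLog() {
    return Collections.unmodifiableList(activityLog);
  }
}
```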