Visual Basic

    Public Function PerformOCROperation( _
        ByVal ObjVer As ObjVer, _
        ByVal FileVer As FileVer, _
        Optional ByVal OCROptions As OCROptions = 0, _
        Optional ByVal ZoneRecognitionMode As MFOCRZoneRecognitionMode = MFOCRZoneRecognitionModeNoZoneRecognition, _
        Optional ByVal ZoneRecognitionPages As OCRPages = 0, _
        Optional ByVal ConvertToSearchablePDF As Boolean = True _
    ) As OCRPageResults

OCROptions: Various parameters for the OCR operation, such as languages to be used as recognition hints.
In scripting languages, a specific null value (e.g., Nothing in VBScript) should be used to indicate the default value.
ZoneRecognitionMode: One of the following MFOCRZoneRecognitionMode values.

Value | Description |
---|---|
MFOCRZoneRecognitionModeAutoDetectZones | Recognize all auto-detected zones. |
MFOCRZoneRecognitionModeNoZoneRecognition | No zone recognition. |
MFOCRZoneRecognitionModeRecognizeSpecifiedZones | Recognize user-defined zones. |
ZoneRecognitionPages: OCR zones to be recognized as typed values. Zones are organized as a collection of pages to enable zone recognition for multipage images. Valid only if ZoneRecognitionMode has the value MFOCRZoneRecognitionModeRecognizeSpecifiedZones.
In scripting languages, a specific null value (e.g., Nothing in VBScript) should be used to indicate the default value.
The target object must be checked out before this method is called. The target file, however, must be available in the latest checked-in version of the object. These preconditions are often not satisfied when M-Files event handlers are executed, so calling this method directly from event handler scripts is not recommended.
You can, however, call this method from the "Run script" action of a workflow state. This is useful if you need to convert scanned documents to searchable PDF files as part of a workflow state transition. Make sure to call the 'GetFilesForModificationInEventHandler' method prior to calling 'PerformOCROperation' in the "Run script" action of a workflow state. See the code example "Converting scanned files to searchable PDF files in a workflow state action" below.
This method is available only if M-Files API is used in the server interface mode.
The OCR module for M-Files must be installed and activated.
Converting scanned files to searchable PDF files in a workflow state action

    Option Explicit

    ' Prepare the files of the object for modification by script.
    Dim files
    Set files = Vault.ObjectFileOperations.GetFilesForModificationInEventHandler( ObjVer )

    ' Prepare OCR options.
    Dim opts
    Set opts = CreateObject( "MFilesAPI.OCROptions" )
    opts.PrimaryLanguage = MFOCRLanguageEnglishUS
    opts.SecondaryLanguage = MFOCRLanguageFinnish

    ' Perform OCR on each of the convertible files.
    Dim file
    Dim ext
    For Each file In files

        ' Compare the extension in lower case so that "TIF" and "tif" are treated alike.
        ext = LCase( file.Extension )

        ' Is the file in a convertible file format?
        If ext = "tif" Or _
           ext = "tiff" Or _
           ext = "jpg" Or _
           ext = "jpeg" Or _
           ext = "pdf" Then

            ' Convert this file to searchable PDF.
            Vault.ObjectFileOperations.PerformOCROperation ObjVer, file.FileVer, _
                opts, MFOCRZoneRecognitionModeNoZoneRecognition, Nothing, True

        End If
    Next

The following Visual Basic .NET example checks out an object, recognizes a user-defined OCR zone in each of its files, converts the files to searchable PDF, and checks the object in.

    ' Initialize the API and connect to a vault.
    Dim oServerApp As MFilesAPI.MFilesServerApplication = New MFilesAPI.MFilesServerApplication
    Dim oVault As MFilesAPI.Vault
    ' ...

    ' Initialize the object version.
    Dim oObjectVersion As MFilesAPI.ObjectVersion
    oObjectVersion = ...

    ' Check out the object first. We assume that initially the object is checked in.
    Dim oObjectVersionCheckedOut As MFilesAPI.ObjectVersion
    oObjectVersionCheckedOut = oVault.ObjectOperations.CheckOut(oObjectVersion.ObjVer.ObjID)

    ' Simply process all the files of the object.
    Dim oObjectFiles As MFilesAPI.ObjectFiles = oVault.ObjectFileOperations.GetFiles(oObjectVersionCheckedOut.ObjVer)
    For Each oObjectFile As MFilesAPI.ObjectFile In oObjectFiles

        ' Specify OCR options.
        Dim oOcrOptions As New MFilesAPI.OCROptions
        oOcrOptions.PrimaryLanguage = MFilesAPI.MFOCRLanguage.MFOCRLanguageFinnish
        oOcrOptions.SecondaryLanguage = MFilesAPI.MFOCRLanguage.MFOCRLanguageEnglishUS

        ' Specify an OCR zone to be recognized.
        Dim oOcrZone As New MFilesAPI.OCRZone
        oOcrZone.DataType = MFilesAPI.MFDataType.MFDatatypeText
        oOcrZone.DimensionUnit = MFilesAPI.MFOCRDimensionUnit.MFOCRDimensionUnitMillimeterX10
        oOcrZone.ID = 1
        oOcrZone.Left = 1650   ' This is interpreted as 165.0 mm.
        oOcrZone.Top = 220     ' This is interpreted as 22.0 mm.
        oOcrZone.Width = 300   ' This is interpreted as 30.0 mm.
        oOcrZone.Height = 100  ' This is interpreted as 10.0 mm.

        ' Construct an OCR page object and add the OCR zone to this OCR page.
        Dim oOcrPage As New MFilesAPI.OCRPage
        oOcrPage.OCRZones.Add(0, oOcrZone)

        ' Indicate that all zones contained by this OCR page are
        ' recognized on the page 1 of the source image.
        oOcrPage.PageNum = 1

        ' Construct an OCR page collection and add the OCR page to this collection.
        Dim oOcrPages As New MFilesAPI.OCRPages
        oOcrPages.Add(0, oOcrPage)

        ' Invoke the OCR operation for the target file by requesting
        ' 1) OCR zone recognition with specific OCR zones, and
        ' 2) conversion to a searchable PDF.
        Dim oOcrPageResults As MFilesAPI.OCRPageResults = _
            oVault.ObjectFileOperations.PerformOCROperation( _
                oObjectVersionCheckedOut.ObjVer, _
                oObjectFile.FileVer, _
                oOcrOptions, _
                MFilesAPI.MFOCRZoneRecognitionMode.MFOCRZoneRecognitionModeRecognizeSpecifiedZones, _
                oOcrPages, _
                True _
            )

        ' Process the OCR zone recognition results.
        For Each oOcrPageResult As MFilesAPI.OCRPageResult In oOcrPageResults
            For Each oOcrZoneResult As MFilesAPI.OCRZoneResult In oOcrPageResult.OCRZoneResults
                Console.WriteLine("ID: " & CStr(oOcrZoneResult.ID))
                Console.WriteLine("Recognized value: " & oOcrZoneResult.ResultValue.DisplayValue)
            Next
        Next
    Next

    ' Check in the object to finalize.
    oVault.ObjectOperations.CheckIn(oObjectVersionCheckedOut.ObjVer)
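The zone coordinates in the Visual Basic .NET example are given in tenths of a millimeter, because the zone uses MFOCRDimensionUnitMillimeterX10 (1650 is interpreted as 165.0 mm). If zone positions are measured in millimeters, a small conversion helper can avoid arithmetic slips. The sketch below is only an illustration: `OcrZoneUnits` and `MmToTenths` are hypothetical names, not part of the M-Files API, and rounding to the nearest tenth of a millimeter is an assumption about what is acceptable for the zones.

```vb
Imports System

Module OcrZoneUnits

    ' Hypothetical helper (not part of the M-Files API): converts a length in
    ' millimeters to tenths of a millimeter, rounded to the nearest integer.
    Function MmToTenths(ByVal millimeters As Double) As Integer
        Return CInt(Math.Round(millimeters * 10.0, MidpointRounding.AwayFromZero))
    End Function

    Sub Main()
        ' The zone from the example: 165.0 mm from the left, 22.0 mm from the top,
        ' 30.0 mm wide and 10.0 mm high.
        Console.WriteLine(MmToTenths(165.0)) ' Prints 1650
        Console.WriteLine(MmToTenths(22.0))  ' Prints 220
        Console.WriteLine(MmToTenths(30.0))  ' Prints 300
        Console.WriteLine(MmToTenths(10.0))  ' Prints 100
    End Sub

End Module
```

With such a helper, an assignment like `oOcrZone.Left = 1650` could be written as `oOcrZone.Left = MmToTenths(165.0)`, which keeps the millimeter measurement visible in the code.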