State machine scanning - analyze a document
Description
The following state machine describes a document scanning process. The process starts in the INITIAL state. Over its lifetime, the document can be in one of the following states:
UPLOADED: the action SCAN is triggered by the cloud function onFinalyze. Guards allow the transition to UPLOADED only if:
- the scan is not empty (its size is greater than zero) AND
- the mime type is strictly .png or .jpg
Elements of the payload: the id of the scanned file within the scan collection is the same as the one given at scanning time.
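The two guards above can be sketched as a single predicate. This is only an illustration: the `ScanEvent` class, its field names and the `can_upload` function are hypothetical, and the check is done on the file extension because the guard is stated in terms of .png and .jpg.

```python
from dataclasses import dataclass

# Extensions accepted by the guard, as listed above (".jpeg" is not accepted)
ALLOWED_EXTENSIONS = (".png", ".jpg")


@dataclass
class ScanEvent:
    """Hypothetical shape of the data available when SCAN is triggered."""
    name: str  # file name, for example "receipt.png"
    size: int  # size of the scan in bytes


def can_upload(event: ScanEvent) -> bool:
    """Return True when both guards for the transition to UPLOADED pass."""
    not_empty = event.size > 0
    # Case-sensitive on purpose, to keep the check strict
    allowed_type = event.name.endswith(ALLOWED_EXTENSIONS)
    return not_empty and allowed_type
```

With this sketch, an empty .png file or a non-empty .pdf file would both stay out of the UPLOADED state.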
EXTRACTING: the action EXTRACT is triggered by the Amazon Textract web service.
As a result, the document ends up in one of the following two states:
UNREADABLE: action FAIL EXTRACT => the Textract function returns an error AND/OR the payload is empty => end of state machine
UNTYPED: action SUCCESS EXTRACT => the Textract function returns a correct payload
Note: the payload currently has no type and should be treated as any type returned by Textract.
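The outcome of the extraction can be sketched as a small decision function. The function name and its parameters are hypothetical, and the payload is deliberately left untyped, in line with the note above.

```python
from typing import Any, Optional


def state_after_extract(payload: Optional[Any],
                        error: Optional[Exception] = None) -> str:
    """Map the outcome of EXTRACT to the next state (illustrative only)."""
    # FAIL EXTRACT: an error was returned and/or the payload is empty
    if error is not None or not payload:
        return "UNREADABLE"
    # SUCCESS EXTRACT: the payload is kept as-is, no type is assigned yet
    return "UNTYPED"
```

An error takes precedence here: even a non-empty payload leads to UNREADABLE when an error is reported, which matches the AND/OR wording.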
From UNTYPED, the next step is to select a type:
TYPED: the action SELECT TYPE gives a specific type to the scanned document. The type is based on the nature of the document detected by Textract and is selected from the following list (to be completed):
| nature | type |
|---|---|
| bank Id statement | BIS |
| id card | idCard |
| passport Id | idPassport |
| driving licence Id | idDrivingLicence |
| diagnostic energy | diagEnergy |
| diagnostic asbestos | diagAsbestos |
| diagnostic lead | diagLead |
| diagnostic gas | diagGas |
| diagnostic electricity | diagElectricity |
| diagnostic natural hazard | diagNaturalHazard |
| diagnostic surface Carrez | diagSurfCarrez |
| diagnostic surface Boutin | diagSurfBoutin |
| property tax | TaxProperty |
| rental receipt | ReceiptRental |
If Textract is not able to determine a type, the file moves to the UNREADABLE state => end of state machine
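The type selection and its fallback can be sketched as a dictionary lookup. Only a few rows of the table are reproduced here, and the `select_type` function, as well as the idea that the detected nature arrives as a plain string, are assumptions made for the example.

```python
from typing import Optional, Tuple

# A few rows from the table above (nature -> type); the full list is longer
TYPE_BY_NATURE = {
    "bank Id statement": "BIS",
    "id card": "idCard",
    "passport Id": "idPassport",
    "diagnostic energy": "diagEnergy",
    "rental receipt": "ReceiptRental",
}


def select_type(nature: Optional[str]) -> Tuple[str, Optional[str]]:
    """Return (next state, selected type) for the SELECT TYPE action."""
    doc_type = TYPE_BY_NATURE.get(nature) if nature else None
    if doc_type is None:
        # No type could be determined: end of state machine
        return ("UNREADABLE", None)
    return ("TYPED", doc_type)
```

For instance, `select_type("id card")` gives `("TYPED", "idCard")`, while an unknown or missing nature gives `("UNREADABLE", None)`.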
ANALYZED: once the document is typed, only the relevant information is retrieved from the global Amazon payload, according to the type given in the previous step. This information is represented by an object: "resultingObject".
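One possible way to keep only the relevant information is a per-type list of fields. Everything in this sketch is hypothetical: the field names, the `RELEVANT_FIELDS` mapping, the shape of the returned object, and the assumption that the payload has already been flattened into key/value pairs.

```python
from typing import Any, Dict

# Hypothetical field lists: the real fields per type are not specified here
RELEVANT_FIELDS = {
    "idCard": ("lastName", "firstName", "birthDate"),
    "ReceiptRental": ("tenant", "period", "amount"),
}


def build_resulting_object(doc_type: str,
                           payload: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the payload entries that are relevant for doc_type."""
    wanted = RELEVANT_FIELDS.get(doc_type, ())
    fields = {key: payload[key] for key in wanted if key in payload}
    return {"type": doc_type, "fields": fields}
```

Entries of the payload that are not listed for the given type are simply dropped, and listed fields that are missing from the payload are skipped rather than filled in.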
Once the document is in the ANALYZED state, resultingObject is used by another state machine: see assigning a document.
The information contained in the object is then encoded. For the structure of resultingObject and how its fields map to their use, see encoding a document.
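The whole machine described in this section can be summarized as a transition table. This is a minimal sketch: the action names FAIL SELECT TYPE and ANALYZE do not appear in the description and are placeholders, as is the `next_state` helper.

```python
from typing import Dict, Tuple

# (current state, action) -> next state
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("INITIAL", "SCAN"): "UPLOADED",
    ("UPLOADED", "EXTRACT"): "EXTRACTING",
    ("EXTRACTING", "FAIL EXTRACT"): "UNREADABLE",
    ("EXTRACTING", "SUCCESS EXTRACT"): "UNTYPED",
    ("UNTYPED", "SELECT TYPE"): "TYPED",
    ("UNTYPED", "FAIL SELECT TYPE"): "UNREADABLE",  # placeholder action name
    ("TYPED", "ANALYZE"): "ANALYZED",               # placeholder action name
}


def next_state(state: str, action: str) -> str:
    """Return the next state, or raise ValueError for an invalid transition."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"no transition from {state} on {action}") from None
```

UNREADABLE has no outgoing transition in the table, which reflects the "end of state machine" marker used above.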
Collection
The collection below is standalone, meaning it is not linked to the whole model.