## Models:
- `testConfig`
  - S3 Path: /tests/definitions/{testId}.json
  - Properties:
    - id: string
    - description: string
    - documentSet: string
    - workflow_deployment_name: string // from Vellum
    - release_tag: string // from Vellum
    - lastRun:
      - path // S3 path to the run
      - dateTime // the date and time the run began
      - latency // ms it took to run
      - cost // cost, to 4 decimal places
      - accuracy // decimal from 0 to 1 representing the fraction of correct responses
- `testResult`
  - id: string // 8-character base-62 random string
  - dateTime // date and time the test started
  - documentSetId // the id of the document set that was run
  - documents // array; one entry per document:
    - documentId // string
    - expectedDocument // JSON object the response document should match
    - responseDocument // JSON object; the document part of the response
    - responseMetadata // JSON object; the metadata part of the response
    - incorrectFields
      - jsonPath
      - expected
      - actual
    - success // bool; true only if there are no incorrect fields
    - error
    - usage
    - cost
    - latency
  - usage
  - cost
  - latency
  - accuracy
  - error: string
- `testResultSummary`
  - id: testResult.id
  - dateTime: testResult.dateTime
  - usage: testResult.usage
  - cost: testResult.cost
  - latency: testResult.latency
  - accuracy: testResult.accuracy
  - error: testResult.error
  - documentSetId: testResult.documentSetId
- `testGroup`
  - id: string
  - description: string
  - testIds: string[]
  - lastRun: string
- `testDocumentSet`
  - id: string
  - description: string
  - documentIds // array; one entry per document:
    - id: string
    - expectedDocument // JSON object to compare the test result against
    - documentActual

## S3 Paths
- /tests/definitions/{testId}.json
- /tests/groups/{testGroupId}.json
- /tests/definitions-archived/{testId}.json
  - When a test is archived, it is moved here.
- /tests/results/{testId}/{yyyyMMddHHmmss}/{id}.json
  - Contains a `testResult` object. The date/time is when that group of tests (or single test) was started.
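To make the models above concrete, here is a sketch of them as TypeScript interfaces. This is illustrative, not a final schema: which fields are optional is an assumption, and `summarize` is a hypothetical helper showing how a `testResultSummary` derives from a `testResult`.

```typescript
interface LastRun {
  path: string;      // S3 path to the run
  dateTime: string;  // date and time the run began
  latency: number;   // ms it took to run
  cost: number;      // cost, to 4 decimal places
  accuracy: number;  // 0..1, fraction of correct responses
}

interface TestConfig {
  id: string;
  description: string;
  documentSet: string;
  workflow_deployment_name: string; // from Vellum
  release_tag: string;              // from Vellum
  lastRun?: LastRun;                // assumed absent until the first run
}

interface IncorrectField {
  jsonPath: string;
  expected: unknown;
  actual: unknown;
}

interface DocumentResult {
  documentId: string;
  expectedDocument: Record<string, unknown>;
  responseDocument: Record<string, unknown>;
  responseMetadata: Record<string, unknown>;
  incorrectFields: IncorrectField[];
  success: boolean; // true only when incorrectFields is empty
  error?: string;
  usage?: unknown;
  cost?: number;
  latency?: number;
}

interface TestResult {
  id: string; // 8-character base-62 random string
  dateTime: string;
  documentSetId: string;
  documents: DocumentResult[];
  usage?: unknown;
  cost: number;
  latency: number;
  accuracy: number;
  error?: string;
}

// testResultSummary mirrors testResult minus the per-document detail.
type TestResultSummary = Omit<TestResult, "documents">;

// Hypothetical helper: derive the summary by stripping the documents array.
function summarize(r: TestResult): TestResultSummary {
  const { documents, ...summary } = r;
  return summary;
}
```

The summary object is what would be written to the `{id}-summary.json` key alongside the full result.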
- /tests/results/{testId}/{yyyyMMddHHmmss}/{id}-summary.json
  - A smaller summary of the `testResult`, stored as a `testResultSummary`.
- /documents/{id}
- /documents-processed/{id}
- /documents/test-sets/{setId}.json
- /documents-actuals/{id}
  - Contains the expected JSON result for the document.

## APIs:
- Process a document
  - POST /api/ai-tools/process-document
  - JSON body: `{ "document-s3-key": "path/on/s3.pdf" }`
  - Returns:
    - Result
    - Cost
    - Usage
- Run a test
  - POST /api/ai-tools/test/{test-id}/run
  - No body.
  - Loads the given testConfig from S3 and calls "process-document" for every document in the test config (parallelized, 10 at a time).
  - Updates the testConfig.json with the path to the test results.
- Create a test
  - POST /api/ai-tools/create-test
- List tests
  - GET /api/ai-tools/list-tests?isArchived=true|false
  - Lists all the testConfigs from S3.
  - If isArchived=true, show only the archived tests; if isArchived=false, exclude the archived tests.
- Get a test result
  - GET /api/ai-tools/test/{test-id}/result?timestamp={testResult:yyyyMMddHHmmss}&id={testResultId}
- List test result summaries
  - GET /api/ai-tools/test/{test-id}/result-summaries
  - Retrieves all the `testResultSummary` entries for this test.
- List document sets
  - Returns the name and description of every document set; this should be cached on the server.
- Create document set
- Edit document set
- Update a document set's expected value

## Screens:
- Login: `/login`
  - Use a hardcoded username and password.
- Dashboard: `/dashboard`
  - Lists all the testConfigs.
  - Click a toggle to show the archived tests.
  - Show:
    - Name
    - Description
    - Last run date / cost / accuracy
  - Click a test config to go to the test screen.
  - Button to create a test:
    - Show a modal.
    - Enter Id (textbox) and Description (textarea).
    - Choose a document set (dropdown).
    - Enter a deployment name.
    - Enter a release tag.
    - Create the test.
    - Refresh the dashboard to include the new test.
  - Can select multiple tests to create a test group.
  - Lists the test groups; a test group is a list of tests you can run all at once.
- Test details: `/test/{testId}`
  - Name and description of the test.
  - List of recent test executions, grouped by testDocumentSet:
    - Date/time
    - Cost
    - Accuracy
    - Document group
  - Click a button to run the test; choose a testDocumentSet to run it on.
- Test results: `/test/{testId}/{testResultDate:yyyyMMddHHmmss}/{testResultId}`
  - By default, show only the documents where success = false (at least one incorrect field).
  - Click a toggle to show all the documents.
  - Split-pane view: clicking a document in the left pane shows it in the right pane (collapsed by default).
  - For each document, show the expected and actual values for each field (mismatches only by default; click to show all). Click a field's "expected" value to edit it inline; the updated value is automatically saved to the document set.
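The "run a test" flow combines three mechanical pieces: formatting the yyyyMMddHHmmss run timestamp for the results path, generating the 8-character base-62 result id, and processing documents 10 at a time. A hedged sketch of those pieces follows; `inBatches` is a hypothetical helper, and the actual call to process-document is left abstract.

```typescript
// Format a date as yyyyMMddHHmmss (UTC), matching the results S3 path.
function runTimestamp(d: Date): string {
  const p = (n: number, w = 2) => String(n).padStart(w, "0");
  return (
    p(d.getUTCFullYear(), 4) + p(d.getUTCMonth() + 1) + p(d.getUTCDate()) +
    p(d.getUTCHours()) + p(d.getUTCMinutes()) + p(d.getUTCSeconds())
  );
}

const BASE62 =
  "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

// Generate an 8-character base-62 random id for a test result.
// (Math.random is fine for non-security ids; assumed acceptable here.)
function resultId(length = 8): string {
  let id = "";
  for (let i = 0; i < length; i++) {
    id += BASE62[Math.floor(Math.random() * BASE62.length)];
  }
  return id;
}

// Run `fn` over `items` in chunks of `limit`, so that at most
// `limit` documents (10, per the spec) are in flight at once.
async function inBatches<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = [];
  for (let i = 0; i < items.length; i += limit) {
    const chunk = items.slice(i, i + limit);
    results.push(...(await Promise.all(chunk.map(fn))));
  }
  return results;
}
```

A run's result key could then be assembled as `/tests/results/` + testId + `/` + `runTimestamp(new Date())` + `/` + `resultId()` + `.json`, with each batch entry produced by a call to the process-document endpoint.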