Timing Data
Textract
The data in Textract Timings.xlsx was gathered by uploading the files to S3 and then using the list of S3 objects to issue a series of start_document_text_detection() calls from a Python script in quick succession. All times in the file are given as mi:ss relative to the hour in which the script was executed (Tue, 18 Jan 2022 02:16:XX GMT), so 16:11 means 02:16:11 GMT. Duration is the time from Response Start to Last Modified. The file information is repeated in the following table.
File | Pages | Response Start (mi:ss) | S3 Access Check (mi:ss) | Last Modified (mi:ss) | Duration (s) | Duration/Page (s) |
2020021298_ori.tif | 10 | 16:11 | 16:12 | 16:33 | 22 | 2.20 |
2021090550_ori.tif | 23 | 16:11 | 16:12 | 16:37 | 26 | 1.13 |
20210000019_1.tif | 1 | 16:11 | 16:13 | 16:28 | 17 | 17.00 |
20210000051_4.tif | 4 | 16:11 | 16:13 | 16:30 | 19 | 4.75 |
20210000052_2.tif | 2 | 16:11 | 16:13 | 16:28 | 17 | 8.50 |
20210000088_2.tif | 2 | 16:12 | 16:13 | 16:29 | 17 | 8.50 |
20210000229_269.tif | 269 | 16:12 | 16:13 | 17:57 | 105 | 0.39 |
20210000581_25.tif | 25 | 16:12 | 16:13 | 16:39 | 27 | 1.08 |
20210002141_2.tif | 2 | 16:12 | 16:14 | 16:30 | 18 | 9.00 |
20210002155_1.tif | 1 | 16:12 | 16:14 | 16:28 | 16 | 16.00 |
20210002184_5.tif | 5 | 16:12 | 16:14 | 16:32 | 20 | 4.00 |
20210002185_19.tif | 19 | 16:14 | 16:14 | 16:38 | 24 | 1.26 |
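The submission loop and the duration arithmetic behind the table can be sketched as follows. This is a minimal sketch, not the original script: the function names, the bucket name, and the idea of passing the client in as a parameter are assumptions made for illustration. The only AWS call used is start_document_text_detection() with an S3Object document location.

```python
import time


def start_text_detection_jobs(client, bucket, keys):
    """Start one asynchronous Textract job per S3 object, back to back.

    `client` is assumed to behave like boto3.client("textract").
    Returns a list of (key, job_id, seconds_since_first_call) tuples.
    """
    started = []
    t0 = time.monotonic()
    for key in keys:
        response = client.start_document_text_detection(
            DocumentLocation={"S3Object": {"Bucket": bucket, "Name": key}}
        )
        started.append((key, response["JobId"], time.monotonic() - t0))
    return started


def duration_per_page(start_mmss, end_mmss, pages):
    """Return (duration_s, duration_per_page_s) from two mi:ss clock readings."""
    def to_seconds(mmss):
        minutes, seconds = mmss.split(":")
        return int(minutes) * 60 + int(seconds)

    duration = to_seconds(end_mmss) - to_seconds(start_mmss)
    return duration, round(duration / pages, 2)


# Hypothetical usage (needs boto3 and AWS credentials; "my-bucket" is a placeholder):
#   import boto3
#   jobs = start_text_detection_jobs(
#       boto3.client("textract"), "my-bucket", ["2020021298_ori.tif"]
#   )

# The 269-page row: Response Start 16:12, Last Modified 17:57.
print(duration_per_page("16:12", "17:57", 269))  # (105, 0.39)
```

Taking the client as an argument keeps the loop testable without AWS access. The per-page figures show why short documents look expensive: the roughly 16 to 17 seconds of fixed latency dominates when there are only one or two pages.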
Comprehend
The data in Comprehend Timings.xlsx was gathered by submitting Comprehend jobs in a pipeline.
The first job in the list failed because it pointed to an input that didn’t exist. After this failure, none of our jobs were in the pipeline.
The second job completed after about 8 minutes, and no other job was submitted while it ran. We had previously seen this 8-minute duration for single jobs.
The remaining jobs were submitted with decreasing intervals between submissions (400s, 200s, 100s, …, 4s, 4s). Overlapping the jobs did not eliminate the long processing time: each job still took at least 6 minutes.
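A decreasing submission schedule of this kind can be sketched as below. The exact rule used to shrink the interval is not stated here, so the halving with a 4-second floor is an assumption, as are the function names. The `submit` callable is a hypothetical stand-in for whichever Comprehend start-job call the pipeline used.

```python
import time


def halving_delays(first=400.0, floor=4.0, count=11):
    """Yield `count` delays, each half the previous one, never below `floor`.

    The halving rule and the default values are assumptions for illustration.
    """
    delay = first
    for _ in range(count):
        yield max(delay, floor)
        delay /= 2


def submit_with_delays(submit, inputs, delays, sleep=time.sleep):
    """Wait for each delay, then call submit(item); collect results in order."""
    results = []
    for item, delay in zip(inputs, delays):
        sleep(delay)
        results.append(submit(item))
    return results


print(list(halving_delays()))
# [400.0, 200.0, 100.0, 50.0, 25.0, 12.5, 6.25, 4.0, 4.0, 4.0, 4.0]
```

Passing `sleep` as a parameter lets the schedule be dry-run without waiting.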
The Excel file information is repeated in the following table.
Pages | S3 Text Size (KB) | Job ID | Status | Submit Time | End Time | Duration (s) | Duration (mi:ss) | Time since Last Submission (mi:ss) | Active Prior To Submission |
19 | 58.7 | e8697da65666e8d4f1c2cc8ba445030b | FAILED | 2022-01-21 16:36:17 | 2022-01-21 16:36:25 | 8 | 00:08 | 00:05 | 0 |
10 | 21.5 | a35bb0ee1e3e4dc8aa9e95063022ad15 | COMPLETED | 2022-01-21 16:46:20 | 2022-01-21 16:54:30 | 490 | 08:10 | 10:03 | 0 |
10 | 21.5 | 07367b8d833c5b4bf107c212c211c641 | COMPLETED | 2022-01-21 16:56:32 | 2022-01-21 17:04:40 | 488 | 08:08 | 10:12 | 0 |
23 | 48.5 | 35b7b5f8393ff38ddefce992e45e9acf | COMPLETED | 2022-01-21 17:03:12 | 2022-01-21 17:11:19 | 487 | 08:07 | 06:40 | 1 |
1 | 1.2 | 3feeeee8219cdadbb49a6fedecbabdf0 | COMPLETED | 2022-01-21 17:06:32 | 2022-01-21 17:12:39 | 367 | 06:07 | 03:20 | 1 |
4 | 4.1 | e8926369f926e44af2382d5f63971d46 | COMPLETED | 2022-01-21 17:08:12 | 2022-01-21 17:14:19 | 367 | 06:07 | 01:40 | 2 |
2 | 0.7 | ed73480f29d63280bde7f1758622b547 | COMPLETED | 2022-01-21 17:09:03 | 2022-01-21 17:15:10 | 367 | 06:07 | 00:51 | 3 |
2 | 2.1 | 3f50f507d913947b228ab13532f966e4 | COMPLETED | 2022-01-21 17:09:28 | 2022-01-21 17:15:35 | 367 | 06:07 | 00:25 | 4 |
269 | 372.9 | f2e9be936ccc207feb4aa7fd9ac83765 | COMPLETED | 2022-01-21 17:09:40 | 2022-01-21 17:17:48 | 488 | 08:08 | 00:12 | 5 |
25 | 73.4 | 7141079239ab29e0e16304468eb695fe | COMPLETED | 2022-01-21 17:09:47 | 2022-01-21 17:15:54 | 367 | 06:07 | 00:07 | 6 |
2 | 3.3 | 11584d547020949a80cd63c440c8f178 | COMPLETED | 2022-01-21 17:09:51 | 2022-01-21 17:16:00 | 369 | 06:09 | 00:04 | 7 |
1 | 3.2 | 95c668091d4e5b043116a39eda7ccafb | COMPLETED | 2022-01-21 17:09:55 | 2022-01-21 17:16:02 | 367 | 06:07 | 00:04 | 8 |
5 | 10.5 | 8331004666d0e3ac0038febe87e75f3a | COMPLETED | 2022-01-21 17:09:59 | 2022-01-21 17:16:06 | 367 | 06:07 | 00:04 | 9 |
19 | 58.7 | | EXCEPTION: Too many requests. | | | | | 00:04 | |
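The Active Prior To Submission column can be reproduced from the Submit Time and End Time columns alone. The sketch below assumes "active" means an earlier job whose End Time is later than the new job's Submit Time; the function name is hypothetical.

```python
from datetime import datetime


def active_prior_counts(jobs):
    """For each (submit, end) pair, count earlier jobs still running at submit time.

    `jobs` must be ordered by submit time; times are "YYYY-MM-DD HH:MM:SS" strings.
    """
    fmt = "%Y-%m-%d %H:%M:%S"
    parsed = [(datetime.strptime(s, fmt), datetime.strptime(e, fmt)) for s, e in jobs]
    counts = []
    for i, (submit, _) in enumerate(parsed):
        counts.append(sum(1 for _, end in parsed[:i] if end > submit))
    return counts


# Submit and end times of the first six rows of the table above.
rows = [
    ("2022-01-21 16:36:17", "2022-01-21 16:36:25"),
    ("2022-01-21 16:46:20", "2022-01-21 16:54:30"),
    ("2022-01-21 16:56:32", "2022-01-21 17:04:40"),
    ("2022-01-21 17:03:12", "2022-01-21 17:11:19"),
    ("2022-01-21 17:06:32", "2022-01-21 17:12:39"),
    ("2022-01-21 17:08:12", "2022-01-21 17:14:19"),
]
print(active_prior_counts(rows))  # [0, 0, 0, 1, 1, 2]
```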