Lessons Learned

contact@vinta.com.br

DataError: integer out of range in psycopg can be caused by a table with an auto-incrementing integer id whose sequence has passed 2^31-1 (2,147,483,647), the largest value an integer column can hold. Note that if you TRUNCATE the table, the id sequence DOES NOT reset by default. Use ALTER SEQUENCE sequence_name RESTART WITH 1; for that, or TRUNCATE table_name RESTART IDENTITY; to do both at once. Find out the sequence name with \d+ table_name in psql.
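To make the limit concrete, here is a minimal sketch of the arithmetic in Python. The names INT4_MAX and remaining_ids are hypothetical helpers, not part of psycopg or PostgreSQL; the sketch assumes you have already read the sequence's current value yourself, for example with SELECT last_value FROM sequence_name;.

```python
# Largest value a 4-byte PostgreSQL integer column can hold.
INT4_MAX = 2**31 - 1


def remaining_ids(last_value: int, increment: int = 1) -> int:
    """Return how many more ids the sequence can hand out before the next
    one would no longer fit in an integer column.

    `last_value` is assumed to be the sequence's current value, read
    separately from the database.
    """
    if last_value >= INT4_MAX:
        return 0
    return (INT4_MAX - last_value) // increment
```

A result close to zero suggests the next inserts are about to fail with the error above, whether or not the table itself holds that many rows.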


If you have 4 parallel workers and need to process all rows from a SQL table, it's probably better to split the work with LIMIT/OFFSET into 4 parts and consume them with 4 tasks than to choose a smaller LIMIT and generate more than 4 tasks. OFFSET can get really slow, because the database still has to walk past every skipped row, and it doesn't get any faster when you issue consecutive ones like OFFSET 1000, OFFSET 1100, OFFSET 1200, etc. In Python: limit = math.ceil(n_rows / n_workers); offsets = range(0, n_rows, limit)
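The split described above can be sketched as a small function. This is only an illustration: plan_chunks is a hypothetical name, and it assumes n_rows was obtained beforehand (for example with a COUNT query) and that each worker's query uses a stable ORDER BY so the chunks don't overlap.

```python
import math


def plan_chunks(n_rows: int, n_workers: int) -> list[tuple[int, int]]:
    """Return one (offset, limit) pair per task, covering all n_rows."""
    if n_rows <= 0:
        return []
    # Round up so the last chunk absorbs any remainder.
    limit = math.ceil(n_rows / n_workers)
    return [(offset, limit) for offset in range(0, n_rows, limit)]


# Each pair would feed a query such as:
#   SELECT ... FROM table_name ORDER BY id LIMIT %s OFFSET %s
chunks = plan_chunks(1000, 4)
```

Because the limit is rounded up, the function can return fewer chunks than workers when n_rows is small (9 rows and 4 workers yield 3 chunks of 3), and the last chunk's LIMIT may extend past the end of the table, which is harmless.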
