Updating millions of rows in MySQL: when to commit

I have a for loop that goes through millions of objects. What would be the suggested way to commit this? Here are a few examples I was thinking of:

# after each
for item in items:
    cursor.execute()
    conn.commit()

# at the end
for item in items:
    cursor.execute()
conn.commit()

# after N items
for n, item in enumerate(items, 1):
    cursor.execute()
    if n % N == 0:
        conn.commit()
conn.commit()  # commit any remaining items
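
The third option (commit after N items) can be sketched as a runnable example. This is only an illustration under stated assumptions: it uses Python's standard-library sqlite3 module as a stand-in for a MySQL connection, and the table `items`, its columns, and the helper `update_in_batches` are hypothetical names chosen for the demo. A MySQL DB-API driver would typically use a different parameter placeholder, but the commit pattern would be the same.

```python
import sqlite3


def update_in_batches(conn, items, batch_size=1000):
    """Run one UPDATE per item and commit after every batch_size rows.

    Returns the number of commits issued. The table and column names
    are hypothetical and exist only for this demonstration.
    """
    cursor = conn.cursor()
    commits = 0
    pending = 0
    for item_id, new_value in items:
        cursor.execute(
            "UPDATE items SET value = ? WHERE id = ?", (new_value, item_id)
        )
        pending += 1
        if pending == batch_size:
            conn.commit()
            commits += 1
            pending = 0
    if pending:
        conn.commit()  # flush the final partial batch
        commits += 1
    return commits


# In-memory SQLite database used as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)", [(i, "old") for i in range(1, 2501)]
)
conn.commit()

commit_count = update_in_batches(
    conn, [(i, "new") for i in range(1, 2501)], batch_size=1000
)
updated = conn.execute(
    "SELECT COUNT(*) FROM items WHERE value = 'new'"
).fetchone()[0]
print(commit_count, updated)
```

Counting pending rows rather than testing the loop index avoids committing on the very first item and makes the final partial batch explicit.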

Suggestion : 2

How to update millions of records in a table

OK, say you wanted to update emp to set ename = lower(ename). Instead of updating the rows in place, you could do this:

ops$tkyte@ORA817DEV.US.ORACLE.COM> create table new_emp as
   select empno, LOWER(ename) ename, JOB,
          MGR, HIREDATE, SAL, COMM, DEPTNO
     from emp;

Table created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> drop table emp;

Table dropped.

ops$tkyte@ORA817DEV.US.ORACLE.COM> rename new_emp to emp;

Table renamed.
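
The same create, drop, rename sequence can be tried out in a small script. This is a minimal sketch, assuming Python's standard-library sqlite3 module as a stand-in for Oracle: the trimmed-down `emp` table and its two sample rows are made up for the demo, and SQLite spells the rename as ALTER TABLE ... RENAME TO.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the Oracle instance.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, ename TEXT, job TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(7369, "SMITH", "CLERK"), (7499, "ALLEN", "SALESMAN")],  # sample rows
)
conn.commit()

# Build the new table with the transformed column instead of updating in place.
conn.execute(
    "CREATE TABLE new_emp AS "
    "SELECT empno, lower(ename) AS ename, job FROM emp"
)
# Swap the new table in for the old one.
conn.execute("DROP TABLE emp")
conn.execute("ALTER TABLE new_emp RENAME TO emp")
conn.commit()

names = [row[0] for row in conn.execute("SELECT ename FROM emp ORDER BY empno")]
print(names)
```

A table created from a SELECT does not carry over the original table's indexes or constraints, so those would have to be recreated after the swap.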

The update can be simplified by separating the query from the update. To do that, create another table that stores the rowids of the records in the original table that have to be updated, along with the values to be updated. Then run a PL/SQL script to update the records one by one. The following scripts can be used to test this method.

SQL> create table test_update (id number, name varchar2(100), description varchar2(4000)) storage (initial 48M next 4M);

SQL> declare
  v_n number;
  v_name number;
  v_desc number;
  i number;
begin
  for i in 1..1000000 LOOP
    insert into test_update (id, name, description) values (i, 'Test Name' || i, 'Test Name ' || i || ' description ');
  END LOOP;
end;
/

Elapsed: 00:04:277.23

The above script will insert 1 million rows.

SQL> select count(*) from test_update where description like '%5 description%';

  COUNT(*)
----------
    100000

Elapsed: 00:00:02.63

SQL> create table test_update_rowids as select rowid rid, description from test_update where description like '%5 description%';

Elapsed: 00:00:54.58

The table test_update_rowids stores the rowids and the values needed for the update, i.e. 100000 rows need to be updated.

SQL> declare
begin
  for c1 in (select rid, description from test_update_rowids)
  LOOP
    update test_update set description = c1.description || ' after update'
     where rowid = c1.rid;
  END LOOP;
end;
/

Elapsed: 00:01:82.17

The above script performs the update.
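
The staging-table approach above can be sketched end to end at a smaller scale. This is an illustration only, assuming Python's standard-library sqlite3 module in place of Oracle (SQLite also exposes a rowid for ordinary tables) and 1000 rows instead of 1 million. The table names follow the scripts above.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Oracle.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_update (id INTEGER, name TEXT, description TEXT)")
conn.executemany(
    "INSERT INTO test_update VALUES (?, ?, ?)",
    [(i, f"Test Name{i}", f"Test Name {i} description ") for i in range(1, 1001)],
)

# Step 1: run the query once and stage the matching rowids with their values.
conn.execute(
    "CREATE TABLE test_update_rowids AS "
    "SELECT rowid AS rid, description FROM test_update "
    "WHERE description LIKE '%5 description%'"
)

# Step 2: update the original table row by row, looking rows up by rowid.
staged = conn.execute("SELECT rid, description FROM test_update_rowids").fetchall()
conn.executemany(
    "UPDATE test_update SET description = ? WHERE rowid = ?",
    [(description + " after update", rid) for rid, description in staged],
)
conn.commit()

staged_count = len(staged)
updated_count = conn.execute(
    "SELECT COUNT(*) FROM test_update WHERE description LIKE '% after update'"
).fetchone()[0]
print(staged_count, updated_count)
```

Timings from a toy run like this say nothing about the Oracle timings quoted above; the sketch only shows the shape of the technique.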

Mohan
I ran the update as a single SQL statement, and the elapsed time was slightly more than with the above method. It still has to be tested against update statements containing complex queries or join operations.

SQL> update test_update set description = description || ' after update '
     where description like '%5 description%';

Elapsed: 00:01:100.13

Mohan

What is the OTHER table you are updating from? You must have a detail table elsewhere from which you derive that 15 and 25. So, assuming something like this:

ops$tkyte@ORA920> create table t (clientid int, month int, year int, quantity int);

Table created.

ops$tkyte@ORA920> create table txns (clientid int, month int, year int);

Table created.

ops$tkyte@ORA920> insert into t values (1, 7, 2003, 10);

1 row created.

ops$tkyte@ORA920> insert into t values (2, 7, 2003, 20);

1 row created.

ops$tkyte@ORA920> insert into txns select 1, 7, 2003 from all_objects where rownum <= 15;

15 rows created.

ops$tkyte@ORA920> insert into txns select 2, 7, 2003 from all_objects where rownum <= 25;

25 rows created.

ops$tkyte@ORA920> select * from t;

  CLIENTID      MONTH       YEAR   QUANTITY
---------- ---------- ---------- ----------
         1          7       2003         10
         2          7       2003         20

ops$tkyte@ORA920> update t
     set quantity = quantity + (select count(*)
                                  from txns
                                 where txns.clientid = t.clientid
                                   and txns.month = t.month
                                   and txns.year = t.year);

2 rows updated.

ops$tkyte@ORA920> select * from t;

  CLIENTID      MONTH       YEAR   QUANTITY
---------- ---------- ---------- ----------
         1          7       2003         25
         2          7       2003         45

there you go.
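
The correlated-subquery update above can be reproduced in a few lines. This is a minimal sketch, assuming Python's standard-library sqlite3 module as a stand-in for Oracle. The rows are inserted directly rather than selected from all_objects, which SQLite does not have.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Oracle.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (clientid INT, month INT, year INT, quantity INT)")
conn.execute("CREATE TABLE txns (clientid INT, month INT, year INT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)", [(1, 7, 2003, 10), (2, 7, 2003, 20)]
)
# 15 detail rows for client 1 and 25 for client 2, as in the transcript.
conn.executemany(
    "INSERT INTO txns VALUES (?, ?, ?)",
    [(1, 7, 2003)] * 15 + [(2, 7, 2003)] * 25,
)

# One statement updates every row of t from its matching detail rows.
conn.execute(
    """
    UPDATE t
       SET quantity = quantity + (SELECT COUNT(*)
                                    FROM txns
                                   WHERE txns.clientid = t.clientid
                                     AND txns.month = t.month
                                     AND txns.year = t.year)
    """
)
conn.commit()

rows = conn.execute("SELECT clientid, quantity FROM t ORDER BY clientid").fetchall()
print(rows)  # [(1, 25), (2, 45)]
```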

(It is a shame you are using the wrong datatypes. Only a DATE should be used to hold, well, DATES. Using numbers to hold a year and a month isn't a good practice.)

ops$tkyte@ORA920LAP> CREATE TABLE t
(
  data char(255),
  temp_date date
)
PARTITION BY RANGE (temp_date) (
  PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003', 'dd-mon-yyyy')),
  PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003', 'dd-mon-yyyy')),
  PARTITION part3 VALUES LESS THAN (to_date('15-mar-2003', 'dd-mon-yyyy')),
  PARTITION part4 VALUES LESS THAN (to_date('16-mar-2003', 'dd-mon-yyyy')),
  PARTITION part5 VALUES LESS THAN (to_date('17-mar-2003', 'dd-mon-yyyy')),
  PARTITION part6 VALUES LESS THAN (to_date('18-mar-2003', 'dd-mon-yyyy')),
  PARTITION junk VALUES LESS THAN (MAXVALUE)
)
;

Table created.

ops$tkyte@ORA920LAP> create index t_idx1 on t(temp_date) LOCAL nologging;

Index created.

ops$tkyte@ORA920LAP> alter index t_idx1 unusable;

Index altered.

ops$tkyte@ORA920LAP> begin
  for x in (select 'alter index ' || index_name ||
                   ' rebuild partition ' || partition_name stmt
              from user_ind_partitions
             where index_name = 'T_IDX1')
  loop
    dbms_output.put_line(x.stmt);
    execute immediate x.stmt;
  end loop;
end;
/
alter index T_IDX1 rebuild partition PART1
alter index T_IDX1 rebuild partition PART2
alter index T_IDX1 rebuild partition PART3
alter index T_IDX1 rebuild partition PART4
alter index T_IDX1 rebuild partition PART5
alter index T_IDX1 rebuild partition PART6
alter index T_IDX1 rebuild partition JUNK

PL/SQL procedure successfully completed.

Suggestion : 3

For the multiple-table syntax, UPDATE updates rows in each table named in table_references that satisfy the conditions. Each matching row is updated once, even if it matches the conditions multiple times. For multiple-table syntax, ORDER BY and LIMIT cannot be used.

Single-table UPDATE assignments are generally evaluated from left to right. For multiple-table updates, there is no guarantee that assignments are carried out in any particular order.

When a subquery cannot select directly from the table being updated, you can instead employ a multi-table update in which the subquery is moved into the list of tables to be updated, using an alias to reference it in the outermost WHERE clause.

Multiple-table UPDATE statements can use an inner join written with the comma operator, or any other type of join permitted in SELECT statements, such as LEFT JOIN.

Single-table syntax:

UPDATE [LOW_PRIORITY] [IGNORE] table_reference
    SET assignment_list
    [WHERE where_condition]
    [ORDER BY ...]
    [LIMIT row_count]

value:
    {expr | DEFAULT}

assignment:
    col_name = value

assignment_list:
    assignment [, assignment] ...

Multiple-table syntax:

UPDATE [LOW_PRIORITY] [IGNORE] table_references
    SET assignment_list
    [WHERE where_condition]

If you access a column from the table to be updated in an expression, UPDATE uses the current value of the column. For example, the following statement sets col1 to one more than its current value:

UPDATE t1 SET col1 = col1 + 1;

If an UPDATE statement includes an ORDER BY clause, the rows are updated in the order specified by the clause. This can be useful in certain situations that might otherwise result in an error. Suppose that a table t contains a column id that has a unique index. The following statement could fail with a duplicate-key error, depending on the order in which rows are updated:

UPDATE t SET id = id + 1;

For example, if the table contains 1 and 2 in the id column and 1 is updated to 2 before 2 is updated to 3, an error occurs. To avoid this problem, add an ORDER BY clause to cause the rows with larger id values to be updated before those with smaller values:

UPDATE t SET id = id + 1 ORDER BY id DESC;
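
Why the update order matters can be shown with a toy simulation. This sketch is plain Python, not MySQL: the function `increment_ids` is a hypothetical helper that models a unique index as a set and applies id = id + 1 one row at a time, checking for duplicates after each row, as the paragraph above describes.

```python
def increment_ids(ids, descending):
    """Simulate UPDATE t SET id = id + 1 applied row by row.

    The unique index is modeled as a set; a ValueError stands in for
    the duplicate-key error. Rows are visited in ascending order, or in
    descending order when descending=True (like ORDER BY id DESC).
    """
    current = set(ids)
    for old in sorted(ids, reverse=descending):
        new = old + 1
        if new in current:
            raise ValueError(f"duplicate key {new}")
        current.remove(old)
        current.add(new)
    return sorted(current)


# Ascending order: 1 becomes 2 while the old 2 is still present.
try:
    increment_ids([1, 2], descending=False)
    ascending_failed = False
except ValueError:
    ascending_failed = True

# Descending order: 2 becomes 3 first, then 1 becomes 2.
descending_result = increment_ids([1, 2], descending=True)
print(ascending_failed, descending_result)  # True [2, 3]
```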