数据库优化——postgresql分区与继承

2/13/2017来源:SQL技巧人气:3827

数据库优化——postgresql分区与继承

前言

在实际项目开发中经常需要对各种操作日志进行保存,时间一久数据量就变得很大,所以当面临对这些表直接查询时,往往都会耗费较长的时间,从而影响用户的体验。由于这些表中都是会保存时间序列,并且在具体业务中时间跨度比较小,所以可以通过按月或者按天的操作来进行分表,从而降低查询的代价,提高查询的速度。在postgresql中,可以利用内置的INHERIENTS机制来实现分区,降低整个代价。

在postgrelsql文档中,主要描述了以下几个优点: 1)当分区之后,大多数对该表的查询都集中在一个或者几个分区内,查询性能会得到大幅度的提升。因为其降低了索引的大小,使得其索引结构在内存中的结构更合理。 Ps:只是知道数据库中的索引结构是B-TREE或者B+TREE,这个怎么优化了不是很了解。 2)当查询或者更新某个分区的大部分条目时,可以利用该分区的序列扫描而不用借助索引或者随机访问的方式来进行,因此也能改善该性能。 3)对某一块(某段时间或者某个范围内)数据的加载和删除时,可以明显改善性能 4)很少访问,使用的数据可以移动到成本更低的低速存储设施中。

Partitioning refers to splitting what is logically one large table into smaller physical pieces. Partitioning can PRovide several benefits: Query performance can be improved dramatically in certain situations, particularly when most of the heavily accessed rows of the table are in a single partition or a small number of partitions. The partitioning substitutes for leading columns of indexes, reducing index size and making it more likely that the heavily-used parts of the indexes fit in memory. When queries or updates access a large percentage of a single partition, performance can be improved by taking advantage of sequential scan of that partition instead of using an index and random access reads scattered across the whole table. Bulk loads and deletes can be accomplished by adding or removing partitions, if that requirement is planned into the partitioning design. ALTER TABLE NO INHERIT and DROP TABLE are both far faster than a bulk Operation. These commands also entirely avoid the VACUUM overhead caused by a bulk DELETE. Seldom-used data can be migrated to cheaper and slower storage media.

一 如何创建分区

1.1 创建主表

CREATE TABLE operation_record( id int notnull, operate_date Long notnull );

1.2 创建继承分区

CREATE TABLE tableName _y2006m02( CHECK(operate_date >=DATE'2006-02-01' AND operate_date <DATE'2006-03-01'))INHERITS(measurement);

在实际的开发中,博主是在插入数据时实时检测分区是否存在,不存在则创建。 由于是在Spring中进行开发,此处博主用的是JDBCTemplate(可以通过@Autowired和@Qualifier来按bean名进行区分注入)来进行数据库的DDL操作的。代码如下:

private synchronized void checkExistTable(Long operateTime) { Long month = Long.valueOf(DateFormatUtils.format(operateTime, DateTimeUtil.YYYYMM)); if(existDate.contains(month)) { return; } boolean isExist = operationRecordMapper.checkOperateRecordTable("operate_record_" + month); if(!isExist) { Date startDate = DateUtils.truncate(new Date(operateTime), Calendar.MONTH); Date endDate = DateUtils.addMonths(startDate, 1); jdbcTemplate.execute("create table operation_record_" + month + "(check(operation_time>= " + startDate.getTime() + " and operation_time<" + endDate.getTime() + "), LIKE operation_record including CONSTRAINTS including DEFAULTS including indexes) INHERITS(operation_record)"); jdbcTemplate.execute("create rule rule_operation_record_" + month + " as on insert to operation_record WHERE (operation_time >=" + startDate.getTime() + " and operation_time< " + endDate.getTime() + ") DO INSTEAD INSERT INTO operation_record_" + month + "(id, target_type, reference_id, operation_type, content, remark, operation_user, operation_time) VALUES (NEW.id, NEW.target_type, NEW.reference_id, NEW.operation_type, NEW.content, NEW.remark, NEW.operation_user, NEW.operation_time)"); existDate.add(month); } }

1.3定义约束或者索引

CREATE INDEX measurement_y2006m02_opeatedate ON table_y2006m02(operate_date);

1.4 定义规则或者(触发器)

REATEORREPLACEFUNCTIONmeasurement_insert_trigger()RETURNSTRIGGERAS$$BEGIN IF(NEW.logdate>=DATE'2006-02-01'AND NEW.logdate<DATE'2006-03-01')THEN INSERTINTOmeasurement_y2006m02VALUES(NEW.*); ELSIF(NEW.logdate>=DATE'2006-03-01'AND NEW.logdate<DATE'2006-04-01')THEN INSERTINTOmeasurement_y2006m03VALUES(NEW.*); ... ELSIF(NEW.logdate>=DATE'2008-01-01'AND NEW.logdate<DATE'2008-02-01')THEN INSERTINTOmeasurement_y2008m01VALUES(NEW.*); ELSE RAISEEXCEPTION'Date out of range. Fix the measurement_insert_trigger() function!'; ENDIF; RETURNNULL;END;$$LANGUAGEplpgsql; CREATETRIGGERinsert_measurement_trigger BEFOREINSERTONmeasurement FOREACHROWEXECUTEPROCEDUREmeasurement_insert_trigger();

定义规则:

create rule rule_operation_record_" + month + " as on insert to operation_record WHERE (operation_time >=" + startDate.getTime() + " and operation_time< " + endDate.getTime() + ") DO INSTEAD INSERT INTO operation_record_" + month + "(id, target_type, reference_id, operation_type, content, remark, operation_user, operation_time) VALUES (NEW.id, NEW.target_type, NEW.reference_id, NEW.operation_type, NEW.content, NEW.remark, NEW.operation_user, NEW.operation_time)"); existDate.add(month);