Steps for importing csv / text data to Postgres database table in Ubuntu / Linux
- If the file is too big to process in one step, cut the file into multiple files
- split -l 250000 filename.txt // Will split the file into multiple files with 250000 lines
- In case if all columns are not required to be imported, create a new file with required number of columns. Mind the delimiter used.
- cut -d'|' -f1,3,6,33,34,35 file_name.txt >> out.txt
- Next two lines are specific to my case
- sed -e '/^[a-z,A_Z]/d' test.txt > test_1.txt // To delete the lines which starts with a character.
- sed -i '/^$/d' file.txt //To delete empty lines
- Copy the first line which is the column names to all the splitted file outputs, except the first file. Use "vim".
- I used a python program to delete lines which contains number of columns not equal to three.
#Python Programfh = open('test.txt') # Open the Original filef = open("test_2.txt","w") # Create a new file to store the outputfor line in fh:countLine = line.count("|")if countLine == 2:f.write(line)fh.close()
- Create a table with requited columns. Better to give data type as character varying and size as larger than reqired for all columns (eg character varying (400))
- Copy the formatted txt / csv to postgres table. Here delimiter used is '|'.
- COPY table_name(phone_no,subscriber_name,address,pincode,place,district) FROM 'local_location/filename.txt' DELIMITER '|' CSV HEADER;
- Possible Errors
- Error telling some special characters present, so not able to copy. /* "UTF8": 0xbf */
- Here i have deleted the lines with that special characters. Go this website and figure out the special character.
- https://www.utf8-chartable.de/unicode-utf8-table.pl?utf8=0x ¿
- Using vim editor search for the special character in file and delete it manually.
- After copying the file, do the required process in the table
- --delete FROM table_name WHERE length(phone_no)!=10;
- --delete FROM table_name WHERE length(phone_no)=0;
- --delete FROM table_name WHERE phone_no is null
- --delete FROM table_name WHERE length(subscriber_name)>95;
- --delete FROM table_name WHERE length(address)>295;
- Alter the column datatype to required one
- ALTER TABLE table_name ALTER COLUMN phone_no TYPE bigint USING (phone_no::bigint); //Column name is phone_no
- Fill unfilled column in database table with value from another column, delemited value
- update table_name set pincode = ( SELECT split_part(address, '!', 3));
- Update Column by replacing occurance of a string
- UPDATE
SET = REPLACE ( , 'string1', 'string2' );
Steps for take postgres table dump and import to another Postgres database
- Login as root user
- sudo su
- Login as postgres user
- su - postgres
- Take dump of the table to file named output.sql
- pg_dump -p 5432 --column-inserts --data-only --table=table_name database_name > output.sql
- Copy output.sql to postgres/data folder of the machine where it is to be ported.
- Create table at destination with same table name and properties
- Go inside the /postgres/data folder and give this command
- psql -p 5444 -d database_name -f output.sql
- -p is for port number
- -d for database name