PostgreSQL vs Hadoop for large amounts of data storage and retrieval [closed]

Note: this question exists on dba.SE, but has no answers and practically no views. So I'm posting it here in the hopes that it will get wider attention.

I have recently been tasked with migrating a large volume of data stored in various Excel sheets and CSV files into a structured database. The amount of data to process is enormous, in the range of multiple terabytes. The aim is to provide a quick data retrieval system and to provide statistics about the data.

Since I have years of experience with relational databases, especially Postgres, my first thought was to analyze the data and migrate it to a Postgres DB. However, I recently read about "Big Data", and I see Hadoop being mentioned in many places. I have no experience whatsoever in this field, so I'm inclined not to use these frameworks; however, it looks like this is the standard for storing and processing large amounts of data.

After spending some time on Google, it is still not entirely clear to me what the Big Data paradigm really is and how to "set up a Hadoop cluster". I know that it aims to resolve speed issues when retrieving data from a very large DB, but I still fail to understand where this "DB" is, i.e., is it Hadoop itself, is it some proprietary model, can it be a Postgres DB, ...?

My questions are:

Is it worth learning the Big Data paradigm and implementing a solution based on Hadoop?

Can I get away with using a well-structured Postgres database instead?

Can I migrate my Postgres solution to some kind of Big Data structure if it turns out that it is better?

edited Nov 9 at 15:38

asked Nov 9 at 15:29

Flermat

466825

closed as too broad by Denys Séguret, ChrisF♦ Nov 12 at 13:42

Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer. Avoid asking multiple distinct questions at once. See the How to Ask page for help clarifying this question. If this question can be reworded to fit the rules in the help center, please edit the question.

My questions are clear and concise. Moreover, adequate background information is provided and the ensemble is well structured. Yet it is put on hold as "unclear". This is why this site is losing popularity. Note: the same question has +2 reputation on dba.SE. Figures.
– Flermat
Nov 12 at 14:14

Your question was placed on hold because you ask multiple questions at once and ask for opinion-based answers and resource requests, which are 3 closure reasons this question can fall under. I personally believe your sister question should be closed on the DBA stack too, but I am not a frequent visitor over there.
– WhatsThePoint
Nov 12 at 15:54

I don't see how my questions involve opinions of any kind. There is a problem, and there are two solutions. I need to know which one of the solutions is most suited to the problem. Feel free to block the answers that are opinion-based. The question is not.
– Flermat
Nov 12 at 16:10

"Should I do X" will always be an opinion-based question.
– WhatsThePoint
Nov 12 at 16:29

I totally agree with OP ... someone help the man!
– Veljko89
Nov 13 at 8:18

postgresql hadoop


2 Answers

Migrating from Postgres (or any classic RDBMS) to a "big data" solution is clearly time-consuming. If you have the budget, you can get some help from a public cloud. On Amazon, for example, there is EMR, which pre-packages several big data tools.

Amazon also offers Redshift Spectrum, which is easier to use; here is a talk about it.

answered Nov 9 at 16:44

Le farfadet

356


"Big Data" is a loose term. It means the data can come from anything, such as articles, news, media and other sources, and there is so much of it that it is called Big Data.

Hadoop is free, open-source software for implementing Big Data solutions. As for whether it is worth it: of course, since data has become so important nowadays.

Big Data involves data mining across many data sources, as I said before.

The mining produces data that you need to store in a database, but how depends on how you implement Big Data. The data can be stored in a NoSQL store or in an RDBMS such as PostgreSQL, but you will need some ETL to transform the data because it is so big.
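To make the ETL step a bit more concrete, here is a minimal sketch of streaming a CSV file into a relational table in batches, so the whole file never has to fit in memory. Everything specific in it is an assumption for illustration: the `name` and `value` columns, the `measurements` table, and the helper names are hypothetical, and `sqlite3` is used only as a stand-in so the sketch runs without a database server. With PostgreSQL you would go through a driver such as psycopg2, where the placeholders are written `%s` instead of `?`.

```python
import csv
import io
import sqlite3
from itertools import islice


def clean_rows(lines):
    """Transform step: yield (name, value) tuples, skipping incomplete rows."""
    for row in csv.DictReader(lines):
        name = (row["name"] or "").strip()
        raw = (row["value"] or "").strip()
        if not name or not raw:
            continue  # drop rows with a missing name or value
        yield name, float(raw)


def batched(rows, size):
    """Group an iterator of rows into lists of at most `size` items."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def load(conn, lines, batch_size=1000):
    """Load step: insert cleaned rows batch by batch; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS measurements (name TEXT, value REAL)"
    )
    total = 0
    for batch in batched(clean_rows(lines), batch_size):
        conn.executemany(
            "INSERT INTO measurements (name, value) VALUES (?, ?)", batch
        )
        conn.commit()  # one commit per batch keeps transactions small
        total += len(batch)
    return total


if __name__ == "__main__":
    # Hypothetical sample data standing in for one of the CSV files.
    sample = io.StringIO("name,value\na, 1.5\nb,\n,3\nc,2.5\n")
    conn = sqlite3.connect(":memory:")
    print("rows loaded:", load(conn, sample, batch_size=2))
    print(conn.execute(
        "SELECT COUNT(*), AVG(value) FROM measurements"
    ).fetchone())
```

For multi-terabyte inputs, PostgreSQL's `COPY` command is generally much faster than row-by-row inserts; the batching above is only meant to show the extract, transform, load shape of the job.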

answered Nov 9 at 22:41

dwir182

