Articles Database

liunx

Guest
What would be the best way to use a database of articles?

Should I have a field containing the URL of the article, and a set of keywords in there as well?

Or would it be better to have the entire article in the database (probably about 10,000 characters per article)?


BTW - I want to use MySQL for this.

you could have the article in there, or add a link to the folder it is in and have the article in a file. doesn't really matter, it is all personal preference here :)

sorry - took me a while to get round to reading this.

so, if the file is not in the database, how do you search the entire contents of the files?

well it should at least have the name of the file in it. you can also put another column in it called search terms and search on that as well.

okay - i think i need to turn this question on its head a little.

How much of a cost, in terms of DB size and query time, would it be to have the entire article text in the DB as against a list of keywords and a link?

all depends on how many articles you will have. see, if you stick the whole thing in the db, then you can't index that column, therefore you slow it down. I would stick the title in it and a search column, index both of them, and run it that way. then you get better results and a smaller db. but that is all up to you. see, this forum has 1000's of rows and it is still fast, so the whole article in it won't slow it down too much, you just can't have it indexed.
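A sketch of that suggestion, with the table name, column types, and lengths all invented for illustration: the article body stays in a file, and the DB keeps only the searchable bits, both of them indexed.

    -- Hypothetical schema: the article itself lives on disk, the DB
    -- stores only what gets searched.
    CREATE TABLE articles (
        id           INT          NOT NULL AUTO_INCREMENT,
        title        VARCHAR(100) NOT NULL,
        search_terms VARCHAR(255) NOT NULL,  -- keyword list to search on
        filename     VARCHAR(255) NOT NULL,  -- where the article file lives
        PRIMARY KEY (id),
        INDEX (title),
        INDEX (search_terms)
    );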

index?

sorry - bit of a db noob

I suspected that one. :) indexing is where you can make the db query faster.


Indexes are used to find rows with a specific value of one column fast. Without an index, MySQL has to start with the first record and then read through the whole table until it finds the relevant rows. The bigger the table, the more this costs. If the table has an index for the columns in question, MySQL can quickly get a position to seek to in the middle of the datafile without having to look at all the data. If a table has 1000 rows, this is at least 100 times faster than reading sequentially. Note that if you need to access almost all 1000 rows, it is faster to read sequentially, because we then avoid disk seeks.
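To make that concrete, two hypothetical queries against the sketch table above; EXPLAIN is MySQL syntax that reports the plan a query would use:

    -- A prefix match can seek through the index on search_terms:
    EXPLAIN SELECT title, filename
    FROM articles
    WHERE search_terms LIKE 'mysql%';

    -- A leading wildcard defeats the index, so MySQL falls back to
    -- reading the whole table sequentially:
    EXPLAIN SELECT title, filename
    FROM articles
    WHERE search_terms LIKE '%mysql%';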

ah - thanks Scoutt.

I've got a bit to digest again now.

Scoutt,
I don't understand why he would lose the index by putting the whole file in the database.

Just create the table with like four fields (ID, TITLE, SEARCH_TERMS, ARTICLE; see the sketch below) and have the ARTICLE field be like a length of 20,000. Sure, the overhead on the DB might be high if he does do a "Full Text Search", but I don't see why it wouldn't be possible. Unless you're saying that you'd turn mySQL into some sort of File Database, but I'm not sure how that would work.

Or, if you wanted to keep the overhead on the DB down, you could just store the file name, title and search fields and then accept the processing loss of opening each file during the search to look for the search criteria, but that is a BIG PROCESSING TIME LOSS that grows with the number of articles you have.
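Purely to illustrate that cost, here is what scanning every file at search time looks like if you push it into MySQL itself (LOAD_FILE() is a real MySQL function but requires the FILE privilege; the /articles/ path is invented, and in practice you would open the files from application code instead):

    -- Reads every article file off disk on every search: the
    -- processing loss described above. LOAD_FILE() returns NULL
    -- for any file the server can't read.
    SELECT title, filename
    FROM articles
    WHERE LOAD_FILE(CONCAT('/articles/', filename)) LIKE '%database%';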

If you really need the ability to do "Full Text Searches" then I'd accept the overhead on the DB and go with the first route.
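A minimal sketch of that four-field table, as an alternative to the file-link sketch earlier (types and lengths are assumptions; ARTICLE uses TEXT because VARCHAR in the MySQL of this era tops out at 255 characters):

    -- Hypothetical schema with the whole article in the database.
    CREATE TABLE articles (
        ID           INT          NOT NULL AUTO_INCREMENT,
        TITLE        VARCHAR(255) NOT NULL,
        SEARCH_TERMS VARCHAR(255) NOT NULL,
        ARTICLE      TEXT         NOT NULL,  -- the full ~10,000-character body
        PRIMARY KEY (ID),
        INDEX (TITLE),
        INDEX (SEARCH_TERMS)
    );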

last I read you can't index text fields. and in order to store the article in the db it would have to be TEXT and not varchar.

Oh okay, you're talking about an actual index (most people, when they say index, actually mean primary key).

I was referring to having a primary key in your table as opposed to an index on the field itself, so if you know the primary key, it's still easy to find, and you're not forced to store the file externally.

hehe, that makes sense. nah, you can have those all the time.
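One footnote on "you can't index text fields": MySQL's MyISAM tables do support FULLTEXT indexes on TEXT columns (they have since 3.23), which makes the whole-article approach searchable after all. A sketch, with the table and column names invented:

    -- FULLTEXT indexing works on TEXT columns in MyISAM tables.
    CREATE TABLE articles_ft (
        id      INT          NOT NULL AUTO_INCREMENT,
        title   VARCHAR(255) NOT NULL,
        article TEXT         NOT NULL,
        PRIMARY KEY (id),
        FULLTEXT (title, article)
    ) ENGINE=MyISAM;

    -- Natural-language search over the title and the full body:
    SELECT id, title
    FROM articles_ft
    WHERE MATCH (title, article) AGAINST ('database indexing');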
 