Welcome to TalkGraphics.com
  1. #1
    Join Date
    Sep 2000
    Location
    London, UK
    Posts
    1,436

    Default

    I generate web pages from a template held in a database.

    Do I need to have meta tags for keywords and description? Are these pages cached and searchable?

    Thanks...

    Simon
    ------------------------------
    www.tlaconsultancy.co.uk
    www.bricksandbrass.co.uk

  3. #3
    Join Date
    Aug 2000
    Location
    NS Canada
    Posts
    212

    Default

    Hi Simon ....

    My understanding is, no, dynamically generated pages are not crawled by spiders. But it depends on how your pages are created. For example, if they are inside a frameset, put the meta tags/descriptions inside these. You can list the links inside the NOFRAMES tags and these can be crawled.

    Some resources:

    WebMedic (right at the bottom): http://www.northernwebs.com/set/setsimjr.html

    SearchEngineWatch: http://www.searchenginewatch.com/webmasters/index.html

    Web Promote: http://webpromote.com/

    cfn ... Jen
    Jen Worden
    Web Developer
    www.meadoworks.com

  4. #4
    Join Date
    Aug 2000
    Location
    Ingolstadt, Germany
    Posts
    358

    Default



    Jen wrote:

    <blockquote>

    My understanding is, no, dynamically generated pages are not crawled by spiders. But it depends on how your pages are created.</blockquote>

    The way it depends is on the URL. The search engines have no way of knowing whether the page was dynamically created or just pulled off the hard disc - by the time they receive it, it's all just plain HTML.

    What they do look at is the URL linked to. If there are query parameters in it, e.g. http://www.example.com/readarticle.cgi?article=123 , they are much less likely to crawl it (although some will anyway). For this reason many people use various URL-transforming techniques to have a publicly accessible URL that looks like http://www.example.com/articles.cgi/123 instead, for pages that ought to be indexed. I don't know what approaches are available on your server, Simon, but typically mod_rewrite or CGI PATH_INFO are used.
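
    To make the two URL shapes concrete, here is a minimal Python sketch (not the PHP or CGI setup used on the site) that recovers the same article number from either form. The function name article_id and the ".cgi/" path convention are assumptions for illustration; on a real server the web server or mod_rewrite would hand the trailing path to the script as PATH_INFO.

```python
from urllib.parse import urlsplit, parse_qs


def article_id(url):
    """Return the article number from either URL style, or None.

    Hypothetical helper: handles the query-string form
    (/readarticle.cgi?article=123) and the path form (/articles.cgi/123).
    """
    parts = urlsplit(url)

    # Query-string form: the id travels as ?article=123
    values = parse_qs(parts.query).get("article")
    if values and values[0].isdecimal():
        return int(values[0])

    # Path form: everything after "script.cgi/" plays the role of PATH_INFO
    _script, sep, path_info = parts.path.partition(".cgi/")
    if sep and path_info.isdecimal():
        return int(path_info)

    return None
```

    Both example URLs from the paragraph above resolve to article 123, so the crawlable path form can be published without changing what the script serves.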

    meta-keywords may help a little in search engine indexing, but not very much these days - too many people abused it (content="sex, sex, sex, real estate, Xara X, Pamela Anderson..."), so many engines just ignore it now. meta-description is still helpful, because users see it in the results.

    Jen's right about the noframes content too - if your pages aren't crawlable by users without frames, they won't be crawlable for (most) search engines either.

  5. #5
    Join Date
    Sep 2000
    Location
    London, UK
    Posts
    1,436

    Default

    I do use frames, but I do have a robots.htm which lists all the files which I want crawled.

    Most of the site is static, but things like the bibliography, events, glossary and directory of companies are in MySQL, with the pages generated using PHP; essentially I have a template table, and the script which builds the page gets the template and does a find/replace where content is to go.
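
    The template-plus-find/replace approach described above can be sketched roughly as follows. This is Python rather than the PHP used on the site, and the {{name}} placeholder syntax, the TEMPLATE string and the build_page function are all assumptions for illustration, since the actual template table is not shown. Every value is treated as plain text and HTML-escaped.

```python
import html
import re

# Hypothetical template, standing in for one row of the template table.
TEMPLATE = (
    "<html><head><title>{{title}}</title>\n"
    '<meta name="description" content="{{description}}">\n'
    "</head><body>{{content}}</body></html>"
)


def build_page(template, fields):
    """Fill each {{name}} placeholder with its HTML-escaped value.

    Substitution happens in a single pass, so a value that happens to
    contain "{{...}}" is not expanded again. Unknown placeholders are
    left untouched.
    """
    def substitute(match):
        name = match.group(1)
        if name not in fields:
            return match.group(0)
        return html.escape(fields[name], quote=True)

    return re.sub(r"\{\{(\w+)\}\}", substitute, template)
```

    Because the result is ordinary HTML by the time it leaves the server, a per-page meta description can be filled in the same way as the body text.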

    With the directory, I do have some pages which are specially for the search spiders - these are not in the navigation although a human visitor will get pointed in the right direction.

    So I think all is OK - unless the spiders have given up on the keywords anyway!

    Thanks.

    Simon
    ------------------------------
    www.tlaconsultancy.co.uk
    www.bricksandbrass.co.uk

  6. #6
    Join Date
    Aug 2000
    Location
    NS Canada
    Posts
    212

    Default

    Simon, this came up on evolt today (thelist) and looked like it might be just what you were looking for:
    http://spider-food.net/dynamic-page-optimization.html

    cfn ... Jen
    Jen Worden
    Web Developer
    www.meadoworks.com

  7. #7
    Join Date
    Sep 2000
    Location
    London, UK
    Posts
    1,436

    Default

    Jen

    That covers it - although if Google and Hotbot are beginning to trace through dynamic pages...

    Simon
    ------------------------------
    www.tlaconsultancy.co.uk
    www.bricksandbrass.co.uk

  8. #8
    Join Date
    Aug 2000
    Location
    Ingolstadt, Germany
    Posts
    358

    Default

    Simon,

    I'd be *very* wary of using a 'robots.htm' file like that. Search engines are always on the lookout for 'cheating' techniques intended to make spiders behave differently to people, and your empty links might look just like that. Google sometimes punishes what it sees as 'cheating' with a zero PageRank, which you probably don't want.

    Why not just put the links to all your pages as a sitemap in the <noframes> section of index1? Not only will this placate any cheating-detection algorithms, but it'll mean non-frames or non-JavaScript users will at least be able to read your pages, instead of just getting a link to a page telling them to get lost.
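
    As a rough illustration of the sitemap-in-noframes idea, the Python sketch below builds such a block from a list of pages. The function name noframes_sitemap and the example file names in the usage note are hypothetical. The output would be pasted (or generated) inside the frameset page.

```python
import html


def noframes_sitemap(pages):
    """Build a <noframes> block holding a plain list of links.

    pages is a list of (url, title) pairs; both are HTML-escaped.
    """
    items = "\n".join(
        '  <li><a href="{}">{}</a></li>'.format(
            html.escape(url, quote=True), html.escape(title)
        )
        for url, title in pages
    )
    return "<noframes>\n<body>\n<ul>\n" + items + "\n</ul>\n</body>\n</noframes>"
```

    For example, noframes_sitemap([("glossary.htm", "Glossary"), ("events.htm", "Events")]) yields ordinary links that both a spider and a non-frames browser can follow.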

    [This message was edited by Jen Worden on March 11, 2002 at 15:17.]

  9. #9
    Join Date
    Sep 2000
    Location
    London, UK
    Posts
    1,436

    Default

    I'm muddled now!

    I am using the robots.htm file in the way I believed was normal practice, i.e. to give the path to all the htm files that a spider should visit. Is this correct?

    And on the links pages, they are visitable by a human user and they contain a bit of text, plus a redirect to the database php script which is the entry to that bit of the site. Is this ok?

    Thanks to everyone on this.

    Simon
    ------------------------------
    www.tlaconsultancy.co.uk
    www.bricksandbrass.co.uk

  10. #10
    Join Date
    Aug 2000
    Location
    NS Canada
    Posts
    212

    Default

    Hi Simon ...

    I think you mean a robots.txt file (as opposed to an .html file), yes? In which case all you require is an open "invitation", as its function is actually the opposite - to list which directories/files you don't want crawled.

    Syntax:
    # All robots will spider the domain
    User-agent: *
    Disallow:

    # Disallow directory /cgi-bin/
    User-agent: *
    Disallow: /cgi-bin/

    # Disallow directory /i/
    User-agent: *
    Disallow: /i/
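
    One way to check how rules like these are read is Python's standard urllib.robotparser module. The sketch below is only an illustration: the RULES text merges the two Disallow examples under a single User-agent group, which is an assumption about how they would sit in one file, and the helper name allowed and the agent name "ExampleBot" are made up.

```python
from urllib.robotparser import RobotFileParser

# Assumed single-file layout combining the two Disallow examples.
RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /i/
"""


def allowed(rules_text, url, agent="ExampleBot"):
    """Return True if the given robots.txt text lets agent fetch url."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser.can_fetch(agent, url)
```

    With these rules, a URL under /cgi-bin/ or /i/ is refused while the rest of the site stays open. An empty "Disallow:" line, as in the first example, allows everything.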

    Where you have text on the links pages, I think you've already ensured that they are crawlable (?!)

    cfn ... Jen

    Jen Worden
    Web Developer
    www.meadoworks.com

 

 
