Mod rewrite: remove the SID from a URL and 301 redirect using .htaccess


We have some URLs with a SID in Google's search results, and we want to 301 redirect them to the same pages without the SID. So we need a rewrite rule that changes this URL:

http://www.in-due.de/hochzeitsshop/catalogsearch/result/index/?SID=8df077eea401bda0da7e9a980efe20cf&cat=148&dir=asc&limit=9&order=relevance&p=8&q=gold
into this URL:

http://www.in-due.de/hochzeitsshop/catalogsearch/result/index/?cat=148&dir=asc&limit=9&order=relevance&p=8&q=gold
Basically, this part should be removed:

SID=8df077eea401bda0da7e9a980efe20cf&

Can anyone help?

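Log in to Google's Webmaster Tools. Under Site configuration, in Parameter handling, you will want to add SID to the list. You could also remove the URLs manually, but I would simply use this robots.txt file and let the bots pick it up and drop the URLs that carry a session ID.

Here is the robots.txt file I use for Magento sites. Obviously, you may need to adjust it to your needs: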

# $Id: robots.txt,v magento-specific 2010/28/01 18:24:19 goba Exp $
#
# robots.txt
#
# This file is to prevent the crawling and indexing of certain parts
# of your site by web crawlers and spiders run by sites like Yahoo!
# and Google. By telling these "robots" where not to go on your site,
# you save bandwidth and server resources.
#
# This file will be ignored unless it is at the root of your host:
# Used:    http://example.com/robots.txt
# Ignored: http://example.com/site/robots.txt
#
# For more information about the robots.txt standard, see:
# http://www.robotstxt.org/wc/robots.html
#
# For syntax checking, see:
# http://www.sxw.org.uk/computing/robots/check.html

# Website Sitemap
Sitemap: http://www.yourdomain.com/sitemap.xml

# Crawlers Setup
User-agent: *
Crawl-delay: 10

# Allowable Index
Allow: /*?p=
Allow: /index.php/blog/
Allow: /catalog/seo_sitemap/category/
Allow: /catalogsearch/result/

# Directories
Disallow: /404/
Disallow: /app/
Disallow: /cgi-bin/
Disallow: /downloader/
Disallow: /includes/
Disallow: /js/
Disallow: /lib/
Disallow: /magento/
Disallow: /media/
Disallow: /pkginfo/
Disallow: /report/
Disallow: /skin/
Disallow: /stats/
Disallow: /var/

# Paths (clean URLs)
Disallow: /index.php/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/
Disallow: /catalogsearch/
Disallow: /checkout/
Disallow: /control/
Disallow: /contacts/
Disallow: /customer/
Disallow: /customize/
Disallow: /newsletter/
Disallow: /poll/
Disallow: /review/
Disallow: /sendfriend/
Disallow: /tag/
Disallow: /wishlist/

# Files
Disallow: /cron.php
Disallow: /cron.sh
Disallow: /error_log
Disallow: /install.php
Disallow: /LICENSE.html
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /STATUS.txt

# Paths (no clean URLs)
Disallow: /*.js$
Disallow: /*.css$
Disallow: /*.php$
Disallow: /*?p=*&
Disallow: /*?SID=

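I would not use mod_rewrite for this, since it is overkill in this case. The SID is sometimes needed and should not always be stripped from the URL.

You can do as B00MER suggests and also follow the best practices laid out by Google.

For example, you could add a canonical URL link element to the head of that page.

The combination of robots.txt and canonical URLs should resolve any SEO issues you may have.

Good luck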

OK, I see your point (thanks for the explanation), but Google's index currently contains URLs with the SID, and these lead to 404s. I would like to redirect them to the correct pages (the ones without the SID) for as long as they show up in search results.
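
For the redirect the question asks for, a minimal .htaccess sketch might look like the following. It assumes mod_rewrite is enabled, that the rules sit above Magento's own rewrite rules, and that the parameter is always spelled `SID` in upper case; none of this is confirmed by the thread, so treat it as a starting point and test it on a staging copy first.

```apache
# Hypothetical sketch, not a tested drop-in for any particular Magento setup.
<IfModule mod_rewrite.c>
    RewriteEngine On

    # Match a SID parameter at the start of the query string or right after an "&".
    # %1 captures everything before it (including its trailing "&"),
    # %2 captures everything after it.
    RewriteCond %{QUERY_STRING} ^(.*&)?SID=[^&]*&?(.*)$

    # Redirect permanently to the same path with the SID removed.
    # NE keeps the already-encoded query string from being escaped a second time.
    RewriteRule ^ %{REQUEST_URI}?%1%2 [R=301,NE,L]
</IfModule>
```

With the example URL from the question, `%1` is empty and `%2` is `cat=148&dir=asc&limit=9&order=relevance&p=8&q=gold`, so the redirect target is the SID-free URL shown there. Two caveats: as the answer above points out, the SID is sometimes needed, so it may be safer to limit the rule to specific paths such as the catalog search; and crawlers can only see the 301 if robots.txt does not block the SID URLs.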