Finally Hit a GAE Bottleneck While Doing an Import
Sunday, Jan. 11, 2009
Category: GAE Development
6 Comments
Tags:
Google App Engine
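While importing a large batch of posts, GAE buckled under the load, roared DeadlineExceededError, and died...
It's a GAE limit: every request has to finish within a fixed amount of time...
To get around this... off to search...
http://groups.google.com/group/google-appengine/browse_thread/thread/e37dcd0f38a2f96a?hl=en&pli=1
A discussion on the GAE group... it touches on a lot of key points for datastore optimization...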
If you only need some of the properties for the query that needs 100+
results, you'll need to create a separate set of entities with just those
properties, and query those. Similarly, if you want the query to return
just the keys, you'll need entities containing the properties that are the
subjects of query filters and the keys for the full entities.

100+ entities in a single request is a lot, especially with 40 properties on
each entity. Smaller entities will get() faster, but you might also
consider avoiding needing so many results at once. If you're hitting request
timeouts and really need that much data, you could spread the requests
across multiple requests using JavaScript. This won't reduce the total user
time for the complete result set, but you could reduce perceived latency by
displaying the first 20 results immediately while the remaining 80 are
fetched.

If you're trying to deliver the results all at once like in a downloadable
spreadsheet, you'll have to get clever, maybe use memcache as a workspace
and build it over multiple requests.
maybe use memcache as a workspace
and build it over multiple requests.
A solid solution... split the parsed data into chunks and park them in memcache..
then run an ajax loop the way GAE unit does...
Really not easy to pull off - - !
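The chunk-and-loop idea can be sketched roughly as follows. This is only an illustration under assumptions, not the code the blog actually uses: DictCache is a hypothetical in-process stand-in that mimics the set/get/delete calls of a memcache client so the sketch runs anywhere, and stage_chunks, process_chunk, run_loop, and the key layout are all invented names. On App Engine the stand-in would be swapped for the real memcache API, keeping in mind that memcache entries can be evicted at any time, so a missing chunk has to be handled.

```python
import json


class DictCache(object):
    """Hypothetical in-process stand-in for a memcache client (set/get/delete)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value
        return True

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        self._data.pop(key, None)


def stage_chunks(cache, job_id, posts, chunk_size=20):
    """First request: split the parsed posts and park each chunk in the cache."""
    chunks = [posts[i:i + chunk_size] for i in range(0, len(posts), chunk_size)]
    for n, chunk in enumerate(chunks):
        cache.set("%s:chunk:%d" % (job_id, n), json.dumps(chunk))
    cache.set("%s:total" % job_id, len(chunks))
    return len(chunks)


def process_chunk(cache, job_id, n, save_post):
    """One short request: import chunk n and tell the caller which chunk is next."""
    key = "%s:chunk:%d" % (job_id, n)
    raw = cache.get(key)
    total = cache.get("%s:total" % job_id)
    if raw is None or total is None:
        # The entry was evicted or never staged; the caller must re-stage.
        return {"done": False, "next": None, "imported": 0, "error": "missing chunk"}
    posts = json.loads(raw)
    for post in posts:
        save_post(post)
    cache.delete(key)
    nxt = n + 1
    done = nxt >= total
    return {"done": done, "next": None if done else nxt, "imported": len(posts)}


def run_loop(cache, job_id, save_post):
    """Plays the role of the browser-side ajax loop: one call per chunk."""
    n, imported = 0, 0
    while n is not None:
        result = process_chunk(cache, job_id, n, save_post)
        if "error" in result:
            break
        imported += result["imported"]
        n = result["next"]
    return imported
```

In a real deployment each process_chunk call would sit behind its own URL, and the page's JavaScript would keep requesting the "next" index until "done" comes back true, so no single request has to outlive the deadline.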
6 Responses to “Finally Hit a GAE Bottleneck While Doing an Import”
10 months, 2 weeks ago IP:89.124....
This is the same reason I gave up on importing WordPress's XML format. Instead I imported with a Python script, one post per request.
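A one-post-per-request import along those lines might look like the sketch below. It is an assumption-laden illustration, not the commenter's script: iter_posts and import_one_by_one are made-up names, the send callable stands in for whatever HTTP call actually posts to the blog, and only the title, link, and content:encoded fields of each WXR item are read.

```python
import xml.etree.ElementTree as ET

# Namespace of the content:encoded element used in RSS/WXR exports.
CONTENT_NS = "{http://purl.org/rss/1.0/modules/content/}"


def iter_posts(wxr_text):
    """Yield one small dict per <item> in the export."""
    root = ET.fromstring(wxr_text)
    for item in root.iter("item"):
        yield {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "body": item.findtext(CONTENT_NS + "encoded", default=""),
        }


def import_one_by_one(wxr_text, send):
    """Call send(post) once per post, so each request carries a single post."""
    sent = 0
    for post in iter_posts(wxr_text):
        send(post)  # hypothetical: e.g. an HTTP POST to an import URL
        sent += 1
    return sent
```

Because every call handles exactly one post, the work per request stays small no matter how large the export file is.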
10 months, 2 weeks ago IP:202.108...
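Yeah, that's a good idea too... it's plenty for personal use... Actually, importing WordPress's XML format is already done now..
You just have to run the import a few times; it automatically bypasses whatever was imported before. Each step of the import runs inside a transaction, so nothing gets messed up.. I also found that Django signals eat quite a bit of processing time, and I optimized around the datastore's entity group and key mechanisms... fewer queries, fetching directly by key / key_name with get instead.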
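Why a re-run can safely skip earlier results is easier to see in a small sketch. This is not the blog's import code: the store below is a plain dict standing in for the datastore, and key_name_for and import_posts are hypothetical names. The assumption is that each post gets a deterministic key name, so checking whether it already exists is a direct lookup rather than a query. On App Engine the lookup would correspond to something like get_by_key_name, with the check-then-write wrapped in db.run_in_transaction.

```python
import hashlib


def key_name_for(post):
    """Derive a stable key name from the post's link (assumed to be unique)."""
    digest = hashlib.sha1(post["link"].encode("utf-8")).hexdigest()
    return "post-" + digest


def import_posts(store, posts):
    """Import posts into store, skipping any whose key name already exists.

    Returns (created, skipped) so repeated runs can report their progress.
    """
    created = skipped = 0
    for post in posts:
        name = key_name_for(post)
        if name in store:  # direct lookup by key name, no query needed
            skipped += 1
            continue
        store[name] = dict(post)
        created += 1
    return created, skipped
```

If a run dies on the request deadline halfway through, running it again only creates the posts that are still missing.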
GAE really has a few serious flaws
Possible solutions:
8 months ago IP:123.6.1...
My WXR file exported from WordPress is a bit over 600 KB, and I always get "500 the server is struggling... "
and then South Park shows up below
8 months ago IP:202.108...
I recommend importing with the local script: cd to apps\import_wxp\
and run import.py (import.py -h shows the usage)
Here is an example:
import.py -f c:/wordpress.xml -m evertobe@gmail.com -a inforsphere -s 6.latest.inforsphere.appspot.com
5 months, 3 weeks ago IP:166.111...
Please take a look at this error: when I run import.py -h it says some modules can't be found?
C:\Program Files\Google\google_appengine\ihere\apps\import_wxp>import.py -h
Traceback (most recent call last):
  File "C:\Program Files\Google\google_appengine\ihere\apps\import_wxp\import.py", line 26, in <module>
    init_env()
  File "C:\Program Files\Google\google_appengine\ihere\apps\import_wxp\import.py", line 10, in init_env
    from appenginepatch.appenginepatcher.patch import patch_all, setup_logging
  File "E:\Program Files\Google\google_appengine\ihere\common\appenginepatch\appenginepatcher\patch.py", line 7, in <module>
  File "C:/Program Files/Google/google_appengine\google\appengine\ext\db\__init__.py", line 88, in <module>
    from google.appengine.api import datastore
  File "C:/Program Files/Google/google_appengine\google\appengine\api\datastore.py", line 47, in <module>
    from google.appengine.datastore import datastore_index
  File "C:/Program Files/Google/google_appengine\google\appengine\datastore\datastore_index.py", line 53, in <module>
    from google.appengine.api import validation
  File "C:/Program Files/Google/google_appengine\google\appengine\api\validation.py", line 44, in <module>
    import yaml
ImportError: No module named yaml
5 months, 3 weeks ago IP:202.108...
The yaml module needs to be installed. It ships with Google's SDK. If you're on Windows, open cmd, cd to C:\Program Files\Google\google_appengine\lib\yaml, and run python setup.py install. After that it should work.