Python Viterbi algorithm
Author: Sephiroth / Published 2011/1/10 / 886
# -*- coding: utf-8 -*-
# Python 2 code. NPos, GenerateDicPos and ConvertGBKtoUTF are helpers
# that are not included in this snippet.
from heapq import heappush, nlargest


def InitDicForViterbi(nodes, posw, posdi, n):
    newWordList = []
    # Handle out-of-vocabulary words (words missing from posw).
    for i in nodes:
        if i not in posw:
            newWordList.append(i)
    maxPosList = NPos(posdi, n)
    print maxPosList
    GenerateDicPos(newWordList, maxPosList, posw, posdi)


def InitViterbi(node, posw, posdi):
    viStatePath = []
    for i in posw[node]:
        if i != "@@@":
            print "i:", i
            viStatePath.append([posw[node][i] * posdi[i]["@@@"], [i]])
    return viStatePath


def Viterbi(nodes, posw, posdi, n, weightNone):
    """
    nodes is the list of segmented words, posw gives the probability of a
    word taking each part of speech (POS), posdi gives the transition
    probability between POS tags, n is the number of most frequent POS tags
    to use for out-of-vocabulary words, and weightNone is the probability
    assigned to a POS transition that was never observed.
    nodes format {word1,word2...}
    posw format {word1:{pos1:fre,pos2:fre,..."@@@":totalnum},..."@@@":total}
    posdi format {pos1:{pos2:fre,pos3:fre...."@@@":total},...."@@@":total}
    """
    InitDicForViterbi(nodes, posw, posdi, n)
    viStatePath = InitViterbi(nodes[0], posw, posdi)
    length = len(nodes)
    currentNode = 1
    while currentNode < length:
        currentPosList = posw[nodes[currentNode]]
        paths = []
        for k in currentPosList:
            if k != "@@@":
                heap = []
                # Extend every previous state j to the current state k.
                for j in xrange(len(viStatePath)):
                    lastpos = viStatePath[j][1][-1]
                    lastweight = viStatePath[j][0]
                    lastposList = posdi[lastpos]
                    # Look the transition up for each j; the original set
                    # ajk once per k, so a stale value could be reused.
                    ajk = lastposList.get(k, weightNone)
                    currentweight = lastweight * ajk * currentPosList[k]
                    pathNew = viStatePath[j][1] + [k]
                    heappush(heap, [currentweight, pathNew])
                # Keep the most probable path that ends in state k.
                paths.append(nlargest(1, heap)[0])
        viStatePath = paths
        currentNode = currentNode + 1
    heap = []
    # Pick the most probable complete path.
    for i in viStatePath:
        heappush(heap, i)
    return nlargest(1, heap)


def Result(nodes, path, edcode="utf-8"):
    """nodes format {word1,word2,...}
    path is the value returned by Viterbi: [[weight,[pos1,pos2....]]]"""
    realPath = path[0][1]
    ResultPrint(nodes, realPath, edcode)


def ResultPrint(nodes, path, edcode="utf-8"):
    """nodes format {word1,word2,...} path is [pos1,pos2....]"""
    for i in xrange(len(nodes)):
        print nodes[i].decode(edcode), "/", path[i].decode(edcode)


aa = ConvertGBKtoUTF("球球")
bb = ConvertGBKtoUTF("娃娃")
cc = ConvertGBKtoUTF("吃饭")
dd = ConvertGBKtoUTF("好")
ee = ConvertGBKtoUTF("dddwieoewkem")
dictions = {aa: {bb: 1, "@@@": 4}, bb: {cc: 2, aa: 3, "@@@": 40}, "@@@": 400}
posdi = {"n": {"s": 3, "v": 3, "@@@": 40}, "s": {"v": 2, "e": 3, "@@@": 33}, "v": {"@@@": 1}, "@@@": 100}
posw = {aa: {"n": 1, "v": 3, "@@@": 29}, bb: {"n": 1, "s": 1, "@@@": 19}, cc: {"n": 1, "@@@": 1}, "@@@": 10002}
nodes = [aa, bb, cc, ee, dd, bb, aa, cc]
path = Viterbi(nodes, posw, posdi, 3, 0.01)
print path
Result(nodes, path)
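The recurrence inside Viterbi can be easier to follow in isolation. The block below is a minimal Python 3 sketch of it, not the author's code. It reuses the dictionary layout described in the docstring, with "@@@" holding the total count, but makes several assumptions of its own: counts are divided by their "@@@" totals to get relative frequencies, the first word is scored by its emission frequency alone, every word is already present in posw (no out-of-vocabulary handling), and max() replaces the heap. The names viterbi_sketch, TOTAL and the sample dictionaries are hypothetical.

```python
TOTAL = "@@@"  # key that stores the total count in each dictionary


def viterbi_sketch(words, posw, posdi, weight_none=0.01):
    """Return (weight, tags) for the most probable tag sequence.

    posw:  {word: {pos: count, ..., "@@@": total}}
    posdi: {pos: {next_pos: count, ..., "@@@": total}}
    Assumption: every word in `words` already has an entry in posw.
    """
    # Start with one candidate path per tag of the first word.
    first = posw[words[0]]
    paths = [(count / first[TOTAL], [pos])
             for pos, count in first.items() if pos != TOTAL]

    for word in words[1:]:
        entry = posw[word]
        new_paths = []
        for pos, count in entry.items():
            if pos == TOTAL:
                continue
            emit = count / entry[TOTAL]
            candidates = []
            for weight, tags in paths:
                row = posdi.get(tags[-1], {})
                # Unseen transitions fall back to weight_none.
                if pos in row:
                    trans = row[pos] / row[TOTAL]
                else:
                    trans = weight_none
                candidates.append((weight * trans * emit, tags + [pos]))
            # Keep only the best path that ends in this tag.
            new_paths.append(max(candidates, key=lambda c: c[0]))
        paths = new_paths

    return max(paths, key=lambda c: c[0])


# Hypothetical sample data, invented for illustration only.
sample_posw = {
    "dogs": {"n": 9, "v": 1, "@@@": 10},
    "bark": {"n": 2, "v": 8, "@@@": 10},
}
sample_posdi = {
    "n": {"v": 6, "n": 2, "@@@": 8},
    "v": {"n": 3, "@@@": 4},
}
```

Calling viterbi_sketch(["dogs", "bark"], sample_posw, sample_posdi) on this sample data returns the tag sequence ["n", "v"]. Keeping only the best path per ending tag at each step is what keeps the search linear in the number of words.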