代码语言
.
CSharp
.
JS
Java
Asp.Net
C
MSSQL
PHP
Css
PLSQL
Python
Shell
EBS
ASP
Perl
ObjC
VB.Net
VBS
MYSQL
GO
Delphi
AS
DB2
Domino
Rails
ActionScript
Scala
代码分类
文件
系统
字符串
数据库
网络相关
图形/GUI
多媒体
算法
游戏
Jquery
Extjs
Android
HTML5
菜单
网页交互
WinForm
控件
企业应用
安全与加密
脚本/批处理
开放平台
其它
【
Python
】
apriori算法
作者:
sherlockzoom
/ 发布于
2015/1/4
/
894
""" Description : Simple Python implementation of the Apriori Algorithm Usage: $python apriori.py -f DATASET.csv -s minSupport -c minConfidence $python apriori.py -f DATASET.csv -s 0.15 -c 0.6 """ import sys from itertools import chain, combinations from collections import defaultdict from optparse import OptionParser def subsets(arr): """ Returns non empty subsets of arr""" return chain(*[combinations(arr, i + 1) for i, a in enumerate(arr)]) def returnItemsWithMinSupport(itemSet, transactionList, minSupport, freqSet): """calculates the support for items in the itemSet and returns a subset of the itemSet each of whose elements satisfies the minimum support""" _itemSet = set() localSet = defaultdict(int) for item in itemSet: for transaction in transactionList: if item.issubset(transaction): freqSet[item] += 1 localSet[item] += 1 for item, count in localSet.items(): support = float(count)/len(transactionList) if support >= minSupport: _itemSet.add(item) return _itemSet def joinSet(itemSet, length): """Join a set with itself and returns the n-element itemsets""" return set([i.union(j) for i in itemSet for j in itemSet if len(i.union(j)) == length]) def getItemSetTransactionList(data_iterator): transactionList = list() itemSet = set() for record in data_iterator: transaction = frozenset(record) transactionList.append(transaction) for item in transaction: itemSet.add(frozenset([item])) # Generate 1-itemSets return itemSet, transactionList def runApriori(data_iter, minSupport, minConfidence): """ run the apriori algorithm. data_iter is a record iterator Return both: - items (tuple, support) - rules ((pretuple, posttuple), confidence) """ itemSet, transactionList = getItemSetTransactionList(data_iter) freqSet = defaultdict(int) largeSet = dict() # Global dictionary which stores (key=n-itemSets,value=support) # which satisfy minSupport assocRules = dict() # Dictionary which stores Association Rules oneCSet = returnItemsWithMinSupport(itemSet, transactionList, minSupport, freqSet) currentLSet = oneCSet k = 2 while(currentLSet != set([])): largeSet[k-1] = currentLSet currentLSet = joinSet(currentLSet, k) currentCSet = returnItemsWithMinSupport(currentLSet, transactionList, minSupport, freqSet) currentLSet = currentCSet k = k + 1 def getSupport(item): """local function which Returns the support of an item""" return float(freqSet[item])/len(transactionList) toRetItems = [] for key, value in largeSet.items(): toRetItems.extend([(tuple(item), getSupport(item)) for item in value]) toRetRules = [] for key, value in largeSet.items()[1:]: for item in value: _subsets = map(frozenset, [x for x in subsets(item)]) for element in _subsets: remain = item.difference(element) if len(remain) > 0: confidence = getSupport(item)/getSupport(element) if confidence >= minConfidence: toRetRules.append(((tuple(element), tuple(remain)), confidence)) return toRetItems, toRetRules def printResults(items, rules): """prints the generated itemsets and the confidence rules""" for item, support in items: print "item: %s , %.3f" % (str(item), support) print "\n------------------------ RULES:" for rule, confidence in rules: pre, post = rule print "Rule: %s ==> %s , %.3f" % (str(pre), str(post), confidence) def dataFromFile(fname): """Function which reads from the file and yields a generator""" file_iter = open(fname, 'rU') for line in file_iter: line = line.strip().rstrip(',') # Remove trailing comma record = frozenset(line.split(',')) yield record if __name__ == "__main__": optparser = OptionParser() optparser.add_option('-f', '--inputFile', dest='input', help='filename containing csv', default=None) optparser.add_option('-s', '--minSupport', dest='minS', help='minimum support value', default=0.15, type='float') optparser.add_option('-c', '--minConfidence', dest='minC', help='minimum confidence value', default=0.6, type='float') (options, args) = optparser.parse_args() inFile = None if options.input is None: inFile = sys.stdin elif options.input is not None: inFile = dataFromFile(options.input) else: print 'No dataset filename specified, system with exit\n' sys.exit('System will exit') minSupport = options.minS minConfidence = options.minC items, rules = runApriori(inFile, minSupport, minConfidence) printResults(items, rules)
试试其它关键字
apriori
同语言下
.
比较两个图片的相似度
.
过urllib2获取带有中文参数的url内容
.
不下载获取远程图片的宽度和高度及文件大小
.
通过qrcode库生成二维码
.
通过httplib发送GET和POST请求
.
Django下解决小文件下载
.
遍历windows的所有窗口并输出窗口标题
.
根据窗口标题调用窗口
.
python 抓取搜狗指定公众号
.
pandas读取指定列
可能有用的
.
C#实现的html内容截取
.
List 切割成几份 工具类
.
SQL查询 多列合并成一行用逗号隔开
.
一行一行读取txt的内容
.
C#动态修改文件夹名称(FSO实现,不移动文件)
.
c# 移动文件或文件夹
.
c#图片添加水印
.
Java PDF转换成图片并输出给前台展示
.
网站后台修改图片尺寸代码
.
处理大图片在缩略图时的展示
sherlockzoom
贡献的其它代码
(
1
)
.
apriori算法
Copyright © 2004 - 2024 dezai.cn. All Rights Reserved
站长博客
粤ICP备13059550号-3