How to read HDF5 files in Python

newstyles 2023. 7. 23. 14:07

I am trying to read data from an HDF5 file in Python. I can open the file with h5py, but I cannot figure out how to access the data inside it.

My code

import h5py    
import numpy as np    
f1 = h5py.File(file_name,'r+')

This works and the file is opened. But how do I access the data inside the file object f1?

Reading HDF5

import h5py
filename = "file.hdf5"

with h5py.File(filename, "r") as f:
    # Print all root level object names (aka keys) 
    # these can be group or dataset names 
    print("Keys: %s" % f.keys())
    # get first object name/key; may or may NOT be a group
    a_group_key = list(f.keys())[0]

    # get the object type for a_group_key: usually group or dataset
    print(type(f[a_group_key])) 

    # If a_group_key is a group name, 
    # this gets the object names in the group and returns as a list
    data = list(f[a_group_key])

    # If a_group_key is a dataset name, 
    # this gets the dataset values and returns as a list
    data = list(f[a_group_key])
    # preferred methods to get dataset values:
    ds_obj = f[a_group_key]      # returns as a h5py dataset object
    ds_arr = f[a_group_key][()]  # returns as a numpy array
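
When groups are nested several levels deep, the root keys alone do not show the whole layout. The following is a minimal sketch, assuming a throwaway file with made-up names (demo.h5, measurements/run1/x and so on), that uses h5py's visititems to walk every object and record each dataset's path, shape, and dtype.

```python
import os
import tempfile

import h5py
import numpy as np

# Build a small example file; the file name and dataset names are hypothetical.
path = os.path.join(tempfile.mkdtemp(), "demo.h5")
with h5py.File(path, "w") as f:
    # Intermediate groups are created automatically from the path.
    f.create_dataset("measurements/run1/x", data=np.arange(5))
    f.create_dataset("measurements/run1/y", data=np.zeros((2, 3)))
    f.create_dataset("labels", data=np.array([1, 2, 3]))

datasets = {}


def record(name, obj):
    # visititems calls this once for every group and dataset below the root.
    if isinstance(obj, h5py.Dataset):
        datasets[name] = (obj.shape, obj.dtype)


with h5py.File(path, "r") as f:
    f.visititems(record)

for name, (shape, dtype) in sorted(datasets.items()):
    print(name, shape, dtype)
```

Shape and dtype are copied out inside the with block because dataset objects can no longer be used once the file is closed.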

Writing HDF5

import h5py

# Create random data
import numpy as np
data_matrix = np.random.uniform(-1, 1, size=(10, 3))

# Write data to HDF5
with h5py.File("file.hdf5", "w") as data_file:
    data_file.create_dataset("dataset_name", data=data_matrix)

See the h5py documentation for more information.
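
As a small extension of the write example, the sketch below stores the same kind of random matrix with gzip compression, attaches a descriptive attribute, and reads everything back. The file name compressed.h5 and the attribute name description are assumptions chosen for illustration.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file name inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "compressed.h5")
data_matrix = np.random.uniform(-1, 1, size=(10, 3))

with h5py.File(path, "w") as f:
    # compression="gzip" makes h5py store the dataset chunked and compressed.
    ds = f.create_dataset("dataset_name", data=data_matrix, compression="gzip")
    # Attributes hold small pieces of metadata next to the data.
    ds.attrs["description"] = "random values between -1 and 1"

with h5py.File(path, "r") as f:
    ds = f["dataset_name"]
    restored = ds[()]              # full dataset as a numpy array
    compression = ds.compression   # name of the compression filter
    description = ds.attrs["description"]

print(restored.shape, compression, description)
```

Compression is lossless here, so the array read back matches the array that was written.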

Alternatives

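JSON: Good for writing human-readable data; very commonly used (read and write)
CSV: Very simple format (read and write)
pickle: A Python serialization format (read and write)
MessagePack (Python package): More compact representation (read and write)
HDF5 (Python package): Good for matrices (read and write)
XML: exists too *sigh* (read and write)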

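For your application, the following may be important:

Support by other programming languages
Read/write performance
Compactness (file size)

See also: Comparison of data serialization formats

If you are instead looking for a way to create configuration files, you may want to read the short article on configuration files in Python.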

Read the file

import h5py

f = h5py.File(file_name, mode)

Study the structure of the file by printing which HDF5 groups are present.

for key in f.keys():
    print(key) #Names of the root level object names in HDF5 file - can be groups or datasets.
    print(type(f[key])) # get the object type: usually group or dataset

Extract the data

#Get the HDF5 group; key needs to be a group name from above
group = f[key]

#Checkout what keys are inside that group.
for key in group.keys():
    print(key)

# This assumes group[some_key_inside_the_group] is a dataset, 
# and returns a np.array:
data = group[some_key_inside_the_group][()]
#Do whatever you want with data

#After you are done
f.close()
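
Indexing with [()] loads the whole dataset into memory. For large datasets, slicing the dataset object reads only the requested part from disk. A minimal sketch, assuming a hypothetical file big.h5 that contains a group group1 with a dataset values:

```python
import os
import tempfile

import h5py
import numpy as np

# Create an example file; all names here are hypothetical.
path = os.path.join(tempfile.mkdtemp(), "big.h5")
with h5py.File(path, "w") as f:
    f.create_dataset("group1/values", data=np.arange(1000).reshape(100, 10))

with h5py.File(path, "r") as f:
    # Path-style access; equivalent to f["group1"]["values"].
    obj = f["group1/values"]
    # Check the type before treating the object as a dataset.
    is_dataset = isinstance(obj, h5py.Dataset)
    shape_on_disk = obj.shape      # metadata only, no data is read
    first_rows = obj[:5]           # reads rows 0 to 4 as a numpy array
    one_column = obj[:, 3]         # reads a single column

print(is_dataset, shape_on_disk, first_rows.shape, one_column.shape)
```

The with block also closes the file automatically, so no explicit f.close() is needed.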

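You can use pandas. Note that pd.read_hdf relies on the PyTables package and is intended for files written in the pandas/PyTables layout.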

import pandas as pd
pd.read_hdf(filename,key)

Here is a simple function that reads a .hdf5 file created by Keras with the save_weights function and returns a dict of layer names and weights:

import h5py

def read_hdf5(path):
    weights = {}

    keys = []
    with h5py.File(path, 'r') as f: # open file
        f.visit(keys.append) # append all keys to list
        for key in keys:
            if ':' in key: # contains data if ':' in key
                print(f[key].name)
                # [()] replaces the .value attribute, which was removed in h5py 3.0
                weights[f[key].name] = f[key][()]
    return weights

https://gist.github.com/Attila94/fb917e03b04035f3737cc8860d9e9f9b .

I have not tested it thoroughly, but it does the job for me.

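One answer suggests reading the contents of a .hdf5 file into an array as shown below. Be aware that np.fromfile interprets the raw bytes of the file, including the HDF5 header and metadata, as floats, so the result is not the stored dataset values. Use h5py to read the actual data.

import numpy as np
myarray = np.fromfile('file.hdf5', dtype=float)
print(myarray)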
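
A quick check makes the difference between raw bytes and stored values visible. The sketch below, assuming a throwaway file with a hypothetical dataset named values, reads the same file once with np.fromfile and once with h5py and compares the results.

```python
import os
import tempfile

import h5py
import numpy as np

# Write six known values to a temporary file; names are hypothetical.
path = os.path.join(tempfile.mkdtemp(), "file.hdf5")
stored = np.arange(6, dtype=float)
with h5py.File(path, "w") as f:
    f.create_dataset("values", data=stored)

# Raw read: every 8 bytes of the file, metadata included, becomes one float.
raw = np.fromfile(path, dtype=float)

# Structured read: only the contents of the dataset.
with h5py.File(path, "r") as f:
    values = f["values"][()]

print("elements from np.fromfile:", raw.size)
print("elements from h5py:", values.size)
```

The raw read returns more elements than were stored, because the file also contains the HDF5 header and object metadata.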

Use the code below to read the data and convert it into numpy arrays:

import h5py
import numpy as np

f1 = h5py.File('data_1.h5', 'r')
list(f1.keys())
X1 = f1['x']
y1 = f1['y']
# [()] replaces the .value attribute, which was removed in h5py 3.0
df1 = np.array(X1[()])
dfy1 = np.array(y1[()])
print(df1.shape)
print(dfy1.shape)

The preferred way to read dataset values into a numpy array:

import h5py
# use Python file context manager:
with h5py.File('data_1.h5', 'r') as f1:
    print(list(f1.keys()))  # print list of root level objects
    # following assumes 'x' and 'y' are dataset objects
    ds_x1 = f1['x']  # returns h5py dataset object for 'x'
    ds_y1 = f1['y']  # returns h5py dataset object for 'y'
    arr_x1 = f1['x'][()]  # returns np.array for 'x'
    arr_y1 = f1['y'][()]  # returns np.array for 'y'
    arr_x1 = ds_x1[()]  # uses dataset object to get np.array for 'x'
    arr_y1 = ds_y1[()]  # uses dataset object to get np.array for 'y'
    print (arr_x1.shape)
    print (arr_y1.shape)

from keras.models import load_model 

h= load_model('FILE_NAME.h5')

If the datasets in your HDF file are named, you can use the following code to read them and convert them into numpy arrays:

import h5py
import numpy as np

file = h5py.File('filename.h5', 'r')

xdata = file.get('xdata')  # returns None if 'xdata' does not exist
xdata = np.array(xdata)

If your file is in a different directory, you can add the path in front of 'filename.h5'.

You need to create a dataset. The quickstart guide shows that you use the file object to create one, i.e. f.create_dataset, and then you can read the data. This is explained in the documentation.
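
A minimal round trip along those lines might look like the sketch below. The file name roundtrip.h5 and the dataset name mydata are assumptions chosen for illustration.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file name inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "roundtrip.h5")

# Step 1: create the dataset through the file object.
with h5py.File(path, "w") as f:
    f.create_dataset("mydata", data=np.linspace(0.0, 1.0, 11))

# Step 2: reopen the file read-only and read the dataset back.
with h5py.File(path, "r") as f:
    has_dataset = "mydata" in f       # membership test on the root group
    missing = f.get("not_there")      # get() returns None for a missing name
    mydata = f["mydata"][()]          # numpy array with the stored values

print(has_dataset, missing, mydata.shape)
```

Using f.get(name) avoids the KeyError that f[name] raises when the name does not exist, which is handy when the file layout is not known in advance.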

Using parts of the answers to this question and the latest documentation, I was able to extract my numerical arrays with:

import h5py
with h5py.File(filename, 'r') as h5f:
    h5x = h5f[list(h5f.keys())[0]]['x'][()]

where 'x' is simply the X coordinate in my case.

Using this works fine for me:

import h5py

def read_hdf5():
    weights = {}

    keys = []
    with h5py.File("path.h5", 'r') as f: 
        f.visit(keys.append) 
        for key in keys:
            if ':' in key: 
                print(f[key].name)     
                weights[f[key].name] = f[key][()]
    return weights

print(read_hdf5())

If you are using h5py<='2.9.0', you can use the older .value attribute instead:

import h5py

def read_hdf5():
    weights = {}

    keys = []
    with h5py.File("path.h5", 'r') as f: 
        f.visit(keys.append) 
        for key in keys:
            if ':' in key: 
                print(f[key].name)     
                weights[f[key].name] = f[key].value
    return weights

print(read_hdf5())

I recommend H5Attr, a wrapper around h5py that lets you load HDF5 data easily through attributes such as group.dataset (equivalent to group['dataset']), with tab completion in IPython/Jupyter.

The code is here. Below are some usage examples; try the code yourself.

# create example HDF5 file for this guide
import h5py, io
file = io.BytesIO()
with h5py.File(file, 'w') as fp:
    fp['0'] = [1, 2]
    fp['a'] = [3, 4]
    fp['b/c'] = 5
    fp.attrs['d'] = 's'

# import package
from h5attr import H5Attr

# open file
f = H5Attr(file)

# easy access to members, with tab completion in IPython/Jupyter
f.a, f['a']

# also work for subgroups, but note that f['b/c'] is more efficient
# because it does not create f['b']
f.b.c, f['b'].c, f['b/c']

# access to HDF5 attrs via a H5Attr wrapper
f._attrs.d, f._attrs['d']

# show summary of the data
f._show()
# 0   int64 (2,)
# a   int64 (2,)
# b/  1 members

# lazy (default) and non-lazy mode
f = H5Attr(file)
f.a  # <HDF5 dataset "a": shape (2,), type "<i8">

f = H5Attr(file, lazy=False)
f.a  # array([3, 4])

Source URL: https://stackoverflow.com/questions/28170623/how-to-read-hdf5-files-in-python
