Loki and Grafana integration issue

Problem description

After deploying loki-stack with Helm, the Loki data source in Grafana fails its connection test.

loki-stack version: 2.10.2
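For reference, the stack was installed roughly as follows. This is only a sketch: the release name and namespace "loki" are assumptions inferred from the lokiHost=loki.loki:3100 address that shows up in the logs below, not the exact commands used at deployment time.

#Rough install sketch; release name and namespace "loki" are assumed, not confirmed
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm install loki grafana/loki-stack --version 2.10.2 --namespace loki --create-namespace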

(Screenshot: the Loki data source health check in Grafana fails.)

Checking the Grafana logs shows the following errors:

logger=tsdb.loki endpoint=CheckHealth t=2024-04-04T11:56:44.985411573Z level=error msg="Loki health check failed" 
error="error from loki: parse error at line 1, col 1: syntax error: unexpected IDENTIFIER"
logger=context userId=1 orgId=1 uname=admin t=2024-04-04T11:56:44.985485641Z 
level=info msg="Request Completed" method=GET path=/api/datasources/uid/ddhj6qxw5g7pcc/health 
status=400 remote_addr=116.3.95.227 time_ms=53 duration=53.253649ms size=106 referer=https://grafana.lazytoki.cn/connections/datasources/edit/ddhj6qxw5g7pcc handler=/api/datasources/uid/:uid/health status_source=server


logger=tsdb.loki endpoint=checkHealth pluginId=loki dsName=loki dsUID=ddhj6qxw5g7pcc uname=admin fromAlert=false t=2024-04-04T11:56:44.985183092Z 
level=error msg="Error received from Loki" duration=51.467899ms stage=databaseRequest statusCode=400 contentLength=65 start=1970-01-01T00:00:01Z end=1970-01-01T00:00:04Z
 step=1s query=vector(1)+vector(1) queryType=instant direction=backward maxLines=0 supportingQueryType=none lokiHost=loki.loki:3100 lokiPath=/loki/api/v1/query 
 status=error error="parse error at line 1, col 1: syntax error: unexpected IDENTIFIER" statusSource=downstream
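The query being rejected, vector(1)+vector(1), is the one Grafana sends as its data source health check. It can be replayed against Loki directly to confirm the problem is on the Loki side; the address below is taken from lokiHost=loki.loki:3100 in the log above and assumes in-cluster access (or a kubectl port-forward). The bundled Loki 2.6.1 image predates LogQL's vector() function, which is why it answers with a parse error.

#Replay Grafana's health-check query against Loki (in-cluster address assumed)
curl -sG http://loki.loki:3100/loki/api/v1/query --data-urlencode 'query=vector(1)+vector(1)'
#On Loki 2.6.1 this returns the same "unexpected IDENTIFIER" parse error; on 2.9.x it succeeds

In the same deployment, Promtail was also reporting failures while pushing logs to Loki: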



level=warn ts=2024-04-02T15:27:53.199405593Z caller=client.go:419 component=client host=loki:3100 msg="error sending batch, will retry" status=-1 tenant= error="Post \"http://loki:3100/loki/api/v1/push\": dial tcp 10.96.0.245:3100: connect: connection refused"
level=warn ts=2024-04-02T15:27:53.931125608Z caller=client.go:419 component=client host=loki:3100 msg="error sending batch, will retry" status=-1 tenant= error="Post \"http://loki:3100/loki/api/v1/push\": dial tcp 10.96.0.245:3100: connect: connection refused"
level=warn ts=2024-04-02T15:27:54.990545589Z caller=client.go:419 component=client host=loki:3100 msg="error sending batch, will retry" status=-1 tenant= error="Post \"http://loki:3100/loki/api/v1/push\": dial tcp 10.96.0.245:3100: connect: connection refused"
level=warn ts=2024-04-02T15:27:58.984426364Z caller=client.go:419 component=client host=loki:3100 msg="error sending batch, will retry" status=-1 tenant= error="Post \"http://loki:3100/loki/api/v1/push\": dial tcp 10.96.0.245:3100: connect: connection refused"
level=warn ts=2024-04-02T15:28:05.065678857Z caller=client.go:419 component=client host=loki:3100 msg="error sending batch, will retry" status=-1 tenant= error="Post \"http://loki:3100/loki/api/v1/push\": dial tcp 10.96.0.245:3100: connect: connection refused"
level=warn ts=2024-04-02T15:28:16.949318979Z caller=client.go:419 component=client host=loki:3100 msg="error sending batch, will retry" status=-1 tenant= error="Post \"http://loki:3100/loki/api/v1/push\": dial tcp 10.96.0.245:3100: connect: connection refused"
level=warn ts=2024-04-02T15:28:48.663490275Z caller=client.go:419 component=client host=loki:3100 msg="error sending batch, will retry" status=-1 tenant= error="Post \"http://loki:3100/loki/api/v1/push\": dial tcp 10.96.0.245:3100: connect: connection refused"
level=error ts=2024-04-02T15:29:37.729581567Z caller=client.go:430 component=client host=loki:3100 msg="final error sending batch" status=400 tenant= error="server returned HTTP status 400 Bad Request (400): entry for stream '{app=\"openebs\", component=\"ndm\", container=\"openebs-ndm\", filename=\"/var/log/pods/openebs_openebs-ndm-lxjk9_b82b582f-34e5-42e0-9ce7-4705d2b6ecdf/openebs-ndm/0.log\", job=\"openebs/openebs\", namespace=\"openebs\", node_name=\"home-local-worker04\", pod=\"openebs-ndm-lxjk9\", stream=\"stdout\"}' has timestamp too old: 2023-12-14T12:45:25Z, oldest acceptable timestamp is: 2024-03-26T15:29:37Z"
level=error ts=2024-04-02T15:29:37.88113023Z caller=client.go:430 component=client host=loki:3100 msg="final error sending batch" status=400 tenant= error="server returned HTTP status 400 Bad Request (400): entry for stream '{app=\"calico-node\", container=\"calico-node\", filename=\"/var/log/pods/kube-system_calico-node-9qkn9_d94f23b0-91a3-4af5-833a-6709ac0d983a/calico-node/0.log\", job=\"kube-system/calico-node\", namespace=\"kube-system\", node_name=\"home-local-worker04\", pod=\"calico-node-9qkn9\", stream=\"stdout\"}' has timestamp too old: 2024-03-15T14:47:54Z, oldest acceptable timestamp is: 2024-03-26T15:29:37Z"
level=error ts=2024-04-02T15:29:38.055066885Z caller=client.go:430 component=client host=loki:3100 msg="final error sending batch" status=400 tenant= error="server returned HTTP status 400 Bad Request (400): entry for stream '{app=\"ingress-nginx\", component=\"controller\", container=\"controller\", filename=\"/var/log/pods/ingress-nginx_ingress-nginx-controller-lwwcf_1095c76b-affb-44b6-b4cb-107b2b9cf2e5/controller/0.log\", instance=\"ingress-nginx\", job=\"ingress-nginx/ingress-nginx\", namespace=\"ingress-nginx\", node_name=\"home-local-worker04\", pod=\"ingress-nginx-controller-lwwcf\", stream=\"stderr\"}' has timestamp too old: 2024-03-06T13:13:54Z, oldest acceptable timestamp is: 2024-03-26T15:29:38Z"
level=error ts=2024-04-02T15:29:38.222703756Z caller=client.go:430 component=client host=loki:3100 msg="final error sending batch" status=400 tenant= error="server returned HTTP status 400 Bad Request (400): entry for stream '{app=\"ingress-nginx\", component=\"controller\", container=\"controller\", filename=\"/var/log/pods/ingress-nginx_ingress-nginx-controller-lwwcf_1095c76b-affb-44b6-b4cb-107b2b9cf2e5/controller/0.log\", instance=\"ingress-nginx\", job=\"ingress-nginx/ingress-nginx\", namespace=\"ingress-nginx\", node_name=\"home-local-worker04\", pod=\"ingress-nginx-controller-lwwcf\", stream=\"stderr\"}' has timestamp too old: 2024-03-06T18:07:54Z, oldest acceptable timestamp is: 2024-03-26T15:29:38Z"
level=error ts=2024-04-02T15:29:38.395145251Z caller=client.go:430 component=client host=loki:3100 msg="final error sending batch" status=400 tenant= error="server returned HTTP status 400 Bad Request (400): entry for stream '{app=\"calico-node\", container=\"calico-node\", filename=\"/var/log/pods/kube-system_calico-node-9qkn9_d94f23b0-91a3-4af5-833a-6709ac0d983a/calico-node/0.log\", job=\"kube-system/calico-node\", namespace=\"kube-system\", node_name=\"home-local-worker04\", pod=\"calico-node-9qkn9\", stream=\"stdout\"}' has timestamp too old: 2024-03-15T17:47:58Z, oldest acceptable timestamp is: 2024-03-26T15:29:38Z"
level=error ts=2024-04-02T15:29:38.573435913Z caller=client.go:430 component=client host=loki:3100 msg="final error sending batch" status=400 tenant= error="server returned HTTP status 400 Bad Request (400): entry for stream '{app=\"ingress-nginx\", component=\"controller\", container=\"controller\", filename=\"/var/log/pods/ingress-nginx_ingress-nginx-controller-lwwcf_1095c76b-affb-44b6-b4cb-107b2b9cf2e5/controller/0.log\", instance=\"ingress-nginx\", job=\"ingress-nginx/ingress-nginx\", namespace=\"ingress-nginx\", node_name=\"home-local-worker04\", pod=\"ingress-nginx-controller-lwwcf\", stream=\"stderr\"}' has timestamp too old: 2024-03-07T04:29:44Z, oldest acceptable timestamp is: 2024-03-26T15:29:38Z"
level=error ts=2024-04-02T15:29:38.759698035Z caller=client.go:430 component=client host=loki:3100 msg="final error sending batch" status=400 tenant= error="server returned HTTP status 400 Bad Request (400): entry for stream '{app=\"calico-node\", container=\"calico-node\", filename=\"/var/log/pods/kube-system_calico-node-9qkn9_d94f23b0-91a3-4af5-833a-6709ac0d983a/calico-node/0.log\", job=\"kube-system/calico-node\", namespace=\"kube-system\", node_name=\"home-local-worker04\", pod=\"calico-node-9qkn9\", stream=\"stdout\"}' has timestamp too old: 2024-03-15T19:52:57Z, oldest acceptable timestamp is: 2024-03-26T15:29:38Z"
level=error ts=2024-04-02T15:29:38.928628249Z caller=client.go:430 component=client host=loki:3100 msg="final error sending batch" status=400 tenant= error="server returned HTTP status 400 Bad Request (400): entry for stream '{app=\"calico-node\", container=\"calico-node\", filename=\"/var/log/pods/kube-system_calico-node-9qkn9_d94f23b0-91a3-4af5-833a-6709ac0d983a/calico-node/0.log\", job=\"kube-system/calico-node\", namespace=\"kube-system\", node_name=\"home-local-worker04\", pod=\"calico-node-9qkn9\", stream=\"stdout\"}' has timestamp too old: 2024-03-15T20:57:08Z, oldest acceptable timestamp is: 2024-03-26T15:29:38Z"
level=error ts=2024-04-02T15:29:39.103108307Z caller=client.go:430 component=client host=loki:3100 msg="final error sending batch" status=400 tenant= error="server returned HTTP status 400 Bad Request (400): entry for stream '{app=\"calico-node\", container=\"calico-node\", filename=\"/var/log/pods/kube-system_calico-node-9qkn9_d94f23b0-91a3-4af5-833a-6709ac0d983a/calico-node/0.log\", job=\"kube-system/calico-node\", namespace=\"kube-system\", node_name=\"home-local-worker04\", pod=\"calico-node-9qkn9\", stream=\"stdout\"}' has timestamp too old: 2024-03-15T21:59:28Z, oldest acceptable timestamp is: 2024-03-26T15:29:39Z"
level=error ts=2024-04-02T15:29:39.280034691Z caller=client.go:430 component=client host=loki:3100 msg="final error sending batch" status=400 tenant= error="server returned HTTP status 400 Bad Request (400): entry for stream '{app=\"ingress-nginx\", component=\"controller\", container=\"controller\", filename=\"/var/log/pods/ingress-nginx_ingress-nginx-controller-lwwcf_1095c76b-affb-44b6-b4cb-107b2b9cf2e5/controller/0.log\", instance=\"ingress-nginx\", job=\"ingress-nginx/ingress-nginx\", namespace=\"ingress-nginx\", node_name=\"home-local-worker04\", pod=\"ingress-nginx-controller-lwwcf\", stream=\"stderr\"}' has timestamp too old: 2024-03-08T00:33:09Z, oldest acceptable timestamp is: 2024-03-26T15:29:39Z"
level=error ts=2024-04-02T15:29:39.455194861Z caller=client.go:430 component=client host=loki:3100 msg="final error sending batch" status=400 tenant= error="server returned HTTP status 400 Bad Request (400): entry for stream '{app=\"ingress-nginx\", component=\"controller\", container=\"controller\", filename=\"/var/log/pods/ingress-nginx_ingress-nginx-controller-lwwcf_1095c76b-affb-44b6-b4cb-107b2b9cf2e5/controller/0.log\", instance=\"ingress-nginx\", job=\"ingress-nginx/ingress-nginx\", namespace=\"ingress-nginx\", node_name=\"home-local-worker04\", pod=\"ingress-nginx-controller-lwwcf\", stream=\"stderr\"}' has timestamp too old: 2024-03-08T06:24:59Z, oldest acceptable timestamp is: 2024-03-26T15:29:39Z"
level=error ts=2024-04-02T15:29:39.62331309Z caller=client.go:430 component=client host=loki:3100 msg="final error sending batch" status=400 tenant= error="server returned HTTP status 400 Bad Request (400): entry for stream '{app=\"ingress-nginx\", component=\"controller\", container=\"controller\", filename=\"/var/log/pods/ingress-nginx_ingress-nginx-controller-lwwcf_1095c76b-affb-44b6-b4cb-107b2b9cf2e5/controller/0.log\", instance=\"ingress-nginx\", job=\"ingress-nginx/ingress-nginx\", namespace=\"ingress-nginx\", node_name=\"home-local-worker04\", pod=\"ingress-nginx-controller-lwwcf\", stream=\"stderr\"}' has timestamp too old: 2024-03-12T00:14:42Z, oldest acceptable timestamp is: 2024-03-26T15:29:39Z"
level=error ts=2024-04-02T15:29:39.785989489Z caller=client.go:430 component=client host=loki:3100 msg="final error sending batch" status=400 tenant= error="server returned HTTP status 400 Bad Request (400): entry for stream '{app=\"calico-node\", container=\"calico-node\", filename=\"/var/log/pods/kube-system_calico-node-9qkn9_d94f23b0-91a3-4af5-833a-6709ac0d983a/calico-node/0.log\", job=\"kube-system/calico-node\", namespace=\"kube-system\", node_name=\"home-local-worker04\", pod=\"calico-node-9qkn9\", stream=\"stdout\"}' has timestamp too old: 2024-03-16T02:02:08Z, oldest acceptable timestamp is: 2024-03-26T15:29:39Z"
level=error ts=2024-04-02T15:29:39.948559076Z caller=client.go:430 component=client host=loki:3100 msg="final error sending batch" status=400 tenant= error="server returned HTTP status 400 Bad Request (400): entry for stream '{app=\"ingress-nginx\", component=\"controller\", container=\"controller\", filename=\"/var/log/pods/ingress-nginx_ingress-nginx-controller-lwwcf_1095c76b-affb-44b6-b4cb-107b2b9cf2e5/controller/0.log\", instance=\"ingress-nginx\", job=\"ingress-nginx/ingress-nginx\", namespace=\"ingress-nginx\", node_name=\"home-local-worker04\", pod=\"ingress-nginx-controller-lwwcf\", stream=\"stderr\"}' has timestamp too old: 2024-03-15T19:23:00Z, oldest acceptable timestamp is: 2024-03-26T15:29:39Z"
level=error ts=2024-04-02T15:29:40.105307557Z caller=client.go:430 component=client host=loki:3100 msg="final error sending batch" status=400 tenant= error="server returned HTTP status 400 Bad Request (400): entry for stream '{app=\"calico-node\", container=\"calico-node\", filename=\"/var/log/pods/kube-system_calico-node-9qkn9_d94f23b0-91a3-4af5-833a-6709ac0d983a/calico-node/0.log\", job=\"kube-system/calico-node\", namespace=\"kube-system\", node_name=\"home-local-worker04\", pod=\"calico-node-9qkn9\", stream=\"stdout\"}' has timestamp too old: 2024-03-16T05:20:14Z, oldest acceptable timestamp is: 2024-03-26T15:29:40Z"
level=error ts=2024-04-02T15:29:41.229307824Z caller=client.go:430 component=client host=loki:3100 msg="final error sending batch" status=400 tenant= error="server returned HTTP status 400 Bad Request (400): entry for stream '{app=\"calico-node\", container=\"calico-node\", filename=\"/var/log/pods/kube-system_calico-node-9qkn9_d94f23b0-91a3-4af5-833a-6709ac0d983a/calico-node/0.log\", job=\"kube-system/calico-node\", namespace=\"kube-system\", node_name=\"home-local-worker04\", pod=\"calico-node-9qkn9\", stream=\"stdout\"}' has timestamp too old: 2024-03-16T08:14:16Z, oldest acceptable timestamp is: 2024-03-26T15:29:41Z"
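Two different things are going on in the Promtail output. The "connection refused" warnings are just retries from while the Loki pod was not yet reachable. The "timestamp too old" errors come from Loki's ingestion limits: Loki rejects entries older than limits_config.reject_old_samples_max_age (168h here, judging by the one-week window in the errors), so months-old log files being backfilled are answered with HTTP 400. That is separate from the health-check failure, but the active limit can be confirmed from Loki's /config endpoint (same assumed in-cluster address as above):

#Print the running config and check the ingestion window (assumed in-cluster address)
curl -s http://loki.loki:3100/config | grep reject_old_samples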

Solution

Searching Loki's GitHub turned up an issue describing exactly this problem: https://github.com/grafana/loki/issues/11557

In short, the loki-stack chart deployed here pins the Loki image to the fixed tag 2.6.1. Used with a recent Grafana release, that version is too old and causes the compatibility error above, so the image has to be bumped manually to 2.9.3 or the latest release. Quoting from the issue thread:

The subchart needs updating for this too. I am hitting issues on newer versions of Grafana because the Loki version is so out-of-date. The latest release, I believe, is 2.9.5. I had to use a hack to manually set the image repository and tag in the config to get this to work.

Even though I updated to the latest Helm chart version targeting Loki 2.9.3, I had to make a change in the values.yaml to set the image tag for Loki, as on inspection of the pods it was still running version 2.6.1. This fixed my issue. See here for more information.

NB: This is a temporary hack until the subchart is updated to the latest version in Loki Stack.

## Method 1
#If you describe your Loki pod, what version of the image is it running?
#Try extracting the Loki subchart's values from the chart archive with this command:
tar --extract --file=/root/.cache/helm/repository/loki-stack-2.10.2.tgz loki-stack/charts/loki/values.yaml -O > loki_values.yaml

#Then, in loki_values.yaml, change the image tag to 2.9.3:
#Verify the edit (the expected output follows):
grep -i -m1 -A2 "image:" loki_values.yaml
image:
  repository: grafana/loki
  tag: 2.9.3
#Then upgrade your Loki install:
helm upgrade --install loki grafana/loki-stack --set resources.requests.cpu=100m --set resources.requests.memory=256Mi -f loki_values.yaml

#See if this fixes the health check.
#P.S. To make sure your pods pick up the new config, restart them:
kubectl rollout restart sts loki
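To confirm the upgrade actually changed the running image (the point of the "describe your Loki pod" question above), check the image on the Loki pod. The pod name loki-0 and namespace loki are assumptions based on the chart's default StatefulSet naming:

#Check which image the Loki pod is running (pod and namespace names assumed)
kubectl -n loki get pod loki-0 -o jsonpath='{.spec.containers[*].image}'
#Expected: grafana/loki:2.9.3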

https://github.com/grafana/loki/issues/11893

## Method 2
That was the issue for me. After upgrading loki to 2.9.3, it works fine now.
helm upgrade --install loki --namespace=monitoring-system grafana/loki-stack --set loki.image.tag=2.9.3
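With either method, once the pod is running the newer image the data source test in Grafana should pass. The same health-check query from earlier can also be replayed as a quick check (assumed in-cluster address as before):

#The health-check query should now parse instead of failing with "unexpected IDENTIFIER"
curl -sG http://loki.loki:3100/loki/api/v1/query --data-urlencode 'query=vector(1)+vector(1)'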